Preface 



This volume contains the proceedings of the 7th International Symposium on 
Functional and Logic Programming (FLOPS 2004), held in Nara, Japan, April 
7-9, 2004 at the New Public Hall, Nara. 

FLOPS is a forum for research on all issues concerning functional program- 
ming and logic programming. In particular it aims to stimulate the cross-fertiliza- 
tion as well as the integration of the two paradigms. The previous FLOPS meet- 
ings took place in Fuji-Susono (1995), Shonan (1996), Kyoto (1998), Tsukuba 
(1999), Tokyo (2001) and Aizu (2002). The proceedings of FLOPS 1999, FLOPS 
2001 and FLOPS 2002 were published by Springer-Verlag in the Lecture Notes 
in Computer Science series, as volumes 1722, 2024 and 2441, respectively. 

In response to the call for papers, 55 papers were submitted by authors from 
Australia (1), Austria (1), Canada (1), China (4), Denmark (2), Estonia (^), 
France (3^), Germany (4|), Italy (1), Japan (15), the Netherlands (1), Oman 
(1), Portugal (1), Singapore (2), Spain (8), UK (3), and USA (6|;). Each paper 
was reviewed by at least three program committee members with the help of 
expert external reviewers. The program committee meeting was conducted elec- 
tronically for a period of 2 weeks in December 2003. After careful and thorough 
discussion, the program committee selected 18 papers (33%) for presentation at 
the conference. In addition to the 18 contributed papers, the symposium included 
talks by three invited speakers: Masami Hagiya (University of Tokyo), Carsten 
Schürmann (Yale University), and Peter Selinger (University of Ottawa). 

On behalf of the program committee, we would like to thank the invited 
speakers who agreed to give talks and contribute papers, and all those who sub- 
mitted papers to FLOPS 2004. As program chairs, we would like to sincerely 
thank all the members of the FLOPS 2004 program committee for their excellent 
job, and all the external reviewers for their invaluable contribution. The support 
of our sponsors is gratefully acknowledged. We are indebted to the Kayamori 
Foundation of Informational Science Advancement, the Japan Society for Soft- 
ware Science and Technology (JSSST), the Association of Logic Programming 
(ALP), and the Asian Association for Foundation of Software (AAFS). Finally 
we would like to thank the members of the local arrangements committee, in 
particular the local arrangements chair Jacques Garrigue, for their invaluable 
support throughout the preparation and organization of the symposium. 
A Brief Survey of Quantum 
Programming Languages 



Peter Selinger 



Department of Mathematics, University of Ottawa 
Ottawa, Ontario, Canada K1N 6N5 
selinger@mathstat.uottawa.ca 



Abstract. This article is a brief and subjective survey of quantum pro- 
gramming language research. 



1 Quantum Computation 

Quantum computing is a relatively young subject. It has its beginnings in 1982, 
when Paul Benioff and Richard Feynman independently pointed out that a 
quantum mechanical system can be used to perform computations [11, p. 12]. 
Feynman’s interest in quantum computation was motivated by the fact that 
it is computationally very expensive to simulate quantum physical systems on 
classical computers. This is because such simulation involves the 
manipulation of extremely large matrices (whose dimension is exponential in the 
size of the quantum system being simulated). Feynman conceived of quantum 
computers as a means of simulating nature much more efficiently. 

The evidence to this day is that quantum computers can indeed perform 
certain tasks more efficiently than classical computers. Perhaps the best-known 
example is Shor’s factoring algorithm, by which a quantum computer can find 
the prime factors of any integer in probabilistic polynomial time [15]. There 
is no known classical probabilistic algorithm which can solve this problem in 
polynomial time. In the ten years since the publication of Shor’s result, there 
has been an enormous surge of research in quantum algorithms and quantum 
complexity theory. 

2 Quantum Programming Languages 

Quantum physics involves phenomena, such as superposition and entanglement, 
whose properties are not always intuitive. These same phenomena give quantum 
computation its power, and are often at the heart of an interesting quantum 
algorithm. However, there does not yet seem to be a unifying set of principles 
by which quantum algorithms are developed; each new algorithm seems to rely 
on a unique set of “tricks” to achieve its particular goal. 

One of the goals of programming language design is to identify and promote 
useful “high-level” concepts — abstractions or paradigms which allow humans 
to think about a problem in a conceptual way, rather than focusing on the 
details of its implementation. With respect to quantum programming, it is not 
yet clear what a useful set of abstractions would be. But the study of quantum 
programming languages provides a setting in which one can explore possible lan- 
guage features and test their usefulness and expressivity. Moreover, the definition 
of prototypical programming languages creates a unifying formal framework in 
which to view and analyze existing quantum algorithms. 

2.1 Virtual Hardware Models 

Advances in programming languages are often driven by advances in compiler 
design, and vice versa. In the case of quantum computation, the situation is 
complicated by the fact that no practical quantum hardware exists yet, and not 
much is known about the detailed architecture of any future quantum hardware. 

To be able to speak of “implementations”, it is therefore necessary to fix some 
particular, “virtual” hardware model to work with. Here, it is understood that 
future quantum hardware may differ considerably, but the differences should 
ideally be transparent to programmers and should be handled automatically by 
the compiler or operating system. There are several possible virtual hardware 
models to work with, but fortunately all of them are equivalent, at least in 
theory. Thus, one may pick the model which fits one’s computational intuitions 
most closely. 

Perhaps the most popular virtual hardware model, and one of the easiest 
to explain, is the quantum circuit model. Here, a quantum circuit is made up 
from quantum gates in much the same way as a classical logic circuit is made 
up from logic gates. The difference is that quantum gates are always reversible, 
and they correspond to unitary transformations over a complex vector space. 
See e.g. [3] for a succinct introduction to quantum circuits. Of the two basic 
quantum operations, unitary transformations and measurements, the quantum 
circuit model emphasizes the former, with measurements always carried out as 
the very last step in a computation. 

Another virtual hardware model, and one which is perhaps even better suited 
for the interpretation of quantum programming languages, is the QRAM model 
of Knill [9]. Unlike the quantum circuit model, the QRAM model allows unitary 
transformations and measurements to be freely interleaved. In the QRAM model, 
a quantum device is controlled by a universal classical computer. The quantum 
device contains a large, but finite number of individually addressable quantum 
bits, much like a RAM memory chip contains a multitude of classical bits. The 
classical controller sends a sequence of instructions, which are either of the form 
“apply unitary transformation U to qubits i and j” or “measure qubit i”. The 
quantum device carries out these instructions, and responds by making the results 
of the measurements available. 
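
The division of labor in the QRAM model can be illustrated with a small simulation. The following is only a sketch under stated assumptions: the class `QuantumDevice` and the function `bell_pair_program` are hypothetical names introduced here, the device is simulated by a plain state vector, and nothing in it is taken from [9]. The classical controller is ordinary Python code that issues gate and measurement instructions and reads back the measured bits.

```python
import math
import random


class QuantumDevice:
    """Toy state-vector simulator standing in for the quantum device (illustrative only)."""

    def __init__(self, num_qubits, seed=None):
        self.n = num_qubits
        # Amplitudes of the 2**n basis states; the device starts in |0...0>.
        self.amps = [0j] * (2 ** num_qubits)
        self.amps[0] = 1 + 0j
        self.rng = random.Random(seed)

    def apply_1q(self, gate, i):
        """Instruction: apply the 2x2 unitary `gate` to qubit i."""
        bit = 1 << i
        for idx in range(len(self.amps)):
            if idx & bit == 0:
                a0, a1 = self.amps[idx], self.amps[idx | bit]
                self.amps[idx] = gate[0][0] * a0 + gate[0][1] * a1
                self.amps[idx | bit] = gate[1][0] * a0 + gate[1][1] * a1

    def apply_cnot(self, control, target):
        """Instruction: apply a controlled-NOT to two distinct qubits."""
        if control == target:
            raise ValueError("control and target must be distinct qubits")
        cbit, tbit = 1 << control, 1 << target
        for idx in range(len(self.amps)):
            # Swap the amplitudes of |..c=1,t=0..> and |..c=1,t=1..>.
            if (idx & cbit) and not (idx & tbit):
                j = idx | tbit
                self.amps[idx], self.amps[j] = self.amps[j], self.amps[idx]

    def measure(self, i):
        """Instruction: measure qubit i, collapse the state, return the classical bit."""
        bit = 1 << i
        p1 = sum(abs(a) ** 2 for idx, a in enumerate(self.amps) if idx & bit)
        outcome = 1 if self.rng.random() < p1 else 0
        norm = math.sqrt(p1 if outcome else 1 - p1)
        for idx in range(len(self.amps)):
            keep = bool(idx & bit) == bool(outcome)
            self.amps[idx] = self.amps[idx] / norm if keep else 0j
        return outcome


# Hadamard gate as a 2x2 matrix.
H = [[1 / math.sqrt(2), 1 / math.sqrt(2)],
     [1 / math.sqrt(2), -1 / math.sqrt(2)]]


def bell_pair_program(device):
    """Classical controller: interleaves unitary instructions and measurements."""
    device.apply_1q(H, 0)
    device.apply_cnot(0, 1)
    first = device.measure(0)
    second = device.measure(1)
    return first, second
```

In this sketch the two measured bits always agree, because the controller prepares an entangled pair before measuring; the classical program sees only the measurement results, never the amplitudes.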

A third virtual hardware model, which is sometimes used in complexity the- 
ory, is the quantum Turing machine. Here, measurements are never performed, 
and the entire operation of the machine, which consists of a tape, head, and finite 
control, is assumed to be unitary. While this model is theoretically equivalent 
to the previous two models, it is not generally considered to be a very realistic 
approximation of what a future quantum computer might look like. 

2.2 Imperative Quantum Programming Languages 

The earliest proposed quantum programming languages followed the imperative 
programming paradigm. This line of languages was started by Knill [9], who 
gave a set of conventions for writing quantum algorithms in pseudo-code. While 
Knill’s proposal was not very formal, it was very influential in the design of 
later imperative quantum programming languages. More complete imperative 
languages were defined by Ömer [10], Sanders and Zuliani [13], and Bettelli 
et al. [2]. 

A common feature of these imperative quantum programming languages is 
that a program is viewed as a sequence of operations which operate by updating 
some global state. These languages can be directly compiled onto (or interpreted 
in) the QRAM virtual hardware model. Quantum states in this paradigm are 
typically realized as arrays of qubits, and run-time checks are needed to detect 
certain error conditions. For instance, out-of-bounds checks are necessary for 
array accesses, and distinctness checks must be used to ensure i ≠ j when 
applying a binary quantum operation to two qubits i and j. As is typical for 
imperative programming languages, the type system of these languages is not 
rich enough to allow all such checks to be performed at compile-time. Also, 
typically these languages do not have a formal semantics, with the exception of 
Sanders and Zuliani’s language, which possesses an operational semantics. 
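
The two run-time checks mentioned above can be sketched as follows. This is an illustration only, not the mechanism of any of the cited languages; `QubitRegister` and `apply_binary` are hypothetical names, and the register merely records the instructions it accepts.

```python
class QubitRegister:
    """Hypothetical qubit array that validates instructions at run time."""

    def __init__(self, size):
        self.size = size
        self.instructions = []  # accepted instructions, in program order

    def _check_bounds(self, i):
        # Out-of-bounds check for an array access.
        if not 0 <= i < self.size:
            raise IndexError(f"qubit index {i} out of range 0..{self.size - 1}")

    def apply_binary(self, gate_name, i, j):
        """Apply a two-qubit gate; both checks happen before the gate is issued."""
        self._check_bounds(i)
        self._check_bounds(j)
        # Distinctness check: a binary operation needs two different qubits.
        if i == j:
            raise ValueError("a binary quantum operation requires i != j")
        self.instructions.append((gate_name, i, j))
```

Neither failure can be ruled out at compile time when the indices are computed values, which is why the checks are deferred to run time in this style of language.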

The various languages in this category each offer a set of advanced programming 
features. For instance, Ömer’s language QCL contains such features 
as automatic scratch space management and a rich language for describing 
user-defined operators [10]. It also offers some higher-order operations such as 
computing the inverse of a user-defined operator. 

The language of Bettelli et al. emphasizes practicality. It is conceived as an 
extension of C++, and it treats quantum operators as first-class objects which 
can be explicitly constructed and manipulated at run-time [2]. One of the most 
powerful features of this language is the on-the-fly optimization of quantum 
operators, which is performed at run-time. 

Finally, Sanders and Zuliani’s language qGCL is of a somewhat different 
flavor [13]. Based on Dijkstra’s guarded command language, qGCL is as much a 
specification language as a programming language, and it supports a mechanism 
of stepwise refinement which can be used to systematically derive and verify 
programs. 

2.3 Functional Quantum Programming Languages 

In the functional programming style, programs do not operate by updating a 
global state, but by mapping specific inputs to outputs. The data types associ- 
ated with purely functional languages (such as lists, recursive types) are more 
amenable to compile time analysis than their imperative counterparts (such as 
arrays). Consequently, even in very simple functional programming languages, 
many run-time checks can be avoided in favor of compile-time 
analysis. 

The first proposal for a functional quantum programming language was made 
in [14]. In this paper, a language QFC is introduced, which represents programs 
via a functional version of flow charts. The language also has an alternative, 
text-based syntax. Both unitary operations and measurements are directly built 
into the language, and are handled in a type-safe way. Classical and quantum 
features are integrated within the same formalism. There are no run-time type 
checks or errors. The language can be compiled onto the QRAM model, and it 
also possesses a complete denotational semantics, which can be used to formally 
reason about programs. The denotational semantics uses complete partial orders 
of superoperators, and loops and recursion are interpreted as least fixpoints in 
the way which is familiar from domain-theoretic semantics. 

The basic quantum flow chart language of [14] is functional, in the sense of 
being free of side-effects. However, functions are not themselves treated as data, 
and thus the language lacks the higher-order features typical of most functional 
programming languages such as ML or Haskell. It is however possible to extend 
the language with higher-order features. The main technical difficulty concerns 
the proper handling of linearity; here, one has to account for the fact that quan- 
tum information, unlike classical information, cannot be duplicated due to the 
so-called “no-cloning property”. Van Tonder, in a pair of papers [16,17], has 
described a linear lambda calculus for quantum computation, with a type system 
based on Girard’s linear logic [7]. 
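
For reference, the no-cloning property follows from unitarity by a standard short argument; the sketch below is the usual textbook reasoning and is not taken from [16,17]. Suppose a single unitary $U$ copied every state $|\psi\rangle$ onto a blank qubit $|0\rangle$. Since unitary maps preserve inner products, comparing two arbitrary states $|\psi\rangle$ and $|\phi\rangle$ gives:

```latex
U\bigl(|\psi\rangle \otimes |0\rangle\bigr) = |\psi\rangle \otimes |\psi\rangle,
\qquad
U\bigl(|\phi\rangle \otimes |0\rangle\bigr) = |\phi\rangle \otimes |\phi\rangle
\;\Longrightarrow\;
\langle\psi|\phi\rangle \, \langle 0|0\rangle = \langle\psi|\phi\rangle^{2}
\;\Longrightarrow\;
\langle\psi|\phi\rangle \in \{0, 1\}.
```

So such a $U$ can only copy states that are pairwise identical or orthogonal, which is why a type system for quantum data has to track single use.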

Van Tonder’s calculus is “purely” quantum, in the sense that it does not 
incorporate classical data types, nor a measurement operation. If one further 
extends the language with classical features and a measurement primitive, then 
a purely linear type system will no longer be sufficient; instead, one needs a 
system with linear and non-linear types. Such a language, with intuitionistic 
linear logic as its type system, will be presented in a forthcoming paper by 
Benoit Valiron. 

We should also mention that there is some interesting work on using func- 
tional languages to simulate quantum computation. For instance, Sabry [12] 
shows how to model quantum computation in Haskell. 



3 Semantics 

The basic flow chart language of [14] has a satisfactory denotational semantics, 
but it lacks many language features that would be desirable, including higher- 
order features and side-effects. In trying to add new language features, one may 
either work syntactically or semantically. Semantic considerations, in particular, 
may sometimes suggest useful abstractions that are not necessarily apparent 
from a syntactic point of view. We briefly comment on some semantic projects. 

Girard [8] recently defined a notion of quantum coherent spaces as a possible 
semantics for higher-order quantum computation. The class of quantum coherent 
spaces has good closure properties; for instance, it forms a *-autonomous 
category. However, the model is still incomplete, because the interpretation of 
quantum languages in this category is currently limited to the “perfect”, or 
purely linear, fragment of linear logic. This means that classical data is subject 
to the same non-duplication restriction as quantum data in this model. 

A different approach to a semantics for higher-order quantum computation 
is given by Abramsky and Coecke [1]. This work is more qualitative in nature, 
and relies on entanglement and quantum measurement to model higher-order 
functions and their applications, respectively. 

Also on the semantic side, there have been a number of works on possible 
connections between quantum theory and domain theory. For instance, Edalat 
[5] gives a domain-theoretic interpretation of Gleason’s theorem in the presence 
of partial information. Coecke and Martin [4] give a domain-theoretic treatment 
of the von Neumann entropy of a quantum state. 



3.1 Topological Quantum Computation 

A radically different direction in the semantics of quantum computation, and 
one which might lead to the discovery of new conceptual paradigms for quan- 
tum computation, is the work of Freedman, Kitaev, and Wang [6]. This line of 
work seeks to exploit connections between quantum computation and topological 
quantum field theories (TQFT’s). In a nutshell, in topological quantum compu- 
tation, a quantum state is represented by a physical system which is resistant to 
small perturbations. Thus, quantum operations are determined only by global 
topological properties, e.g., linking properties of the paths traversed by some 
particles. This leads to a potentially very robust model of quantum computa- 
tion. It also suggests that there is a more discrete, combinatorial way of viewing 
quantum computation, which might in turn suggest new quantum algorithms. 
These topological approaches to quantum computation are currently limited to 
a description of unitary operators; measurements are not currently considered 
within this model. 



4 Challenges 

There are many remaining challenges in the design and analysis of quantum 
programming languages. One such challenge is to give a sound denotational 
semantics for a higher-order quantum programming language, including classical 
features and measurement. While there has been recent progress on this issue, 
both on the syntactic side and on the semantic side, the connection between 
syntax and semantics remains tenuous at this point, and typically covers only 
fragments of the language. A related question is how to model infinite data types, 
particularly types which include an infinite amount of “quantum” data. 

Another challenge is to formulate a theory of “quantum concurrency”. This is 
not far-fetched, as one can easily imagine networks of quantum processes which 
communicate by exchanging classical and quantum data. There is a considerable 
body of work in quantum information theory and quantum cryptography, which 
suggests some potential applications for quantum concurrent systems. 

Another interesting research area is the implementation of quantum program- 
ming languages on imperfect hardware. Unlike the idealized “virtual machine” 
models of quantum computation, one may assume that real future implemen- 
tations of quantum computation will be subject to the effects of random errors 
and decoherence. There are known error correction techniques for quantum in- 
formation, but it is an interesting question to what extent such techniques can 
be automated, for instance, by integrating them in the compiler or operating 
system, or to what extent specific algorithms might require customized error 
correction techniques. 
Abstract. We have been studying abstractions of linked structures, in 
which cells are connected by pointers, using temporal logic. This paper 
presents some of our results for these abstractions. The system to be verified 
is a transition system on a graph. The shape of the graph does not change 
as a result of the transition, but the label assigned to each cell (node) 
changes according to rewrite rules. The labels of cells are changed syn- 
chronously or asynchronously. We abstract such systems using abstract 
cells and abstract graphs. Abstract cells are characterized by a set of 
temporal formulas, and different abstractions can be tried by changing 
the set of formulas. Some examples of analysis are also described. 



1 Introduction 

Previously, we introduced a method for defining abstractions of heap structures 
in which cells are connected by pointers, mainly for the purpose of verifying algo- 
rithms for concurrent garbage collection [1, 2]. The basic strategy of the method 
is to abstract each cell in a heap structure in terms of regular expressions taken 
from a fixed finite set. Each regular expression represents a property of a cell 
concerning its connectivity with other cells. For example, the regular expression 
w*g holds for a cell that can reach a gray cell via white cells, where w is the label 
of a white cell and g is that of a gray cell (Figure 1). 




Fig. 1. Abstraction of a heap structure by regular expressions 



Then, we introduced branching-time temporal logic for representing such 
properties [3]. For example, the regular expression w*g can be replaced with the 
temporal formula E(w until g) in CTL (computation tree logic) [4, 5]. Notice 
that the modality in temporal formulas corresponds to the connectivity in the 
heap structure. Using temporal logic, one can easily determine the relationship 
between the properties represented by formulas by deriving the implication or 
equivalence of the formulas. 

In our method, we first fix a finite set of temporal formulas F. We abstract a 
cell in a heap according to whether each formula in F holds for the cell (Figure 2). 
Formally, let L denote a heap structure and s be a cell in the heap. We use a(L, s) 
to denote the set of formulas in F that hold at s, i.e., $\{\phi \in F \mid s \models_L \phi\}$. We 
call a(L, s) an abstract cell. In the simplest setting, an entire heap structure is 
abstracted by the set of all abstract cells corresponding to the cells in the heap. 
We call a set of abstract cells an abstract heap. Conversely, ordinary heaps are 
called concrete heaps. 
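
The construction of abstract cells can be sketched on a small concrete heap. The code below is an illustration under stated assumptions: the heap is a finite successor map, the set F is fixed to the two formulas of Figure 2, formulas are identified by plain strings, and the function names (`eu`, `ex`, `abstract_cells`) are hypothetical. E(w until g) is computed as a least fixpoint in the usual CTL manner.

```python
def eu(succ, labels, a, b):
    """Cells satisfying E(a until b), computed as a least fixpoint."""
    sat = {s for s in succ if labels[s] == b}
    changed = True
    while changed:
        changed = False
        for s in succ:
            if s not in sat and labels[s] == a and any(t in sat for t in succ[s]):
                sat.add(s)
                changed = True
    return sat


def ex(succ, target):
    """Cells with at least one successor in `target`."""
    return {s for s in succ if any(t in target for t in succ[s])}


def abstract_cells(succ, labels):
    """Map each concrete cell s to the set of formulas in F that hold at s."""
    gray = {s for s in succ if labels[s] == "g"}
    white = {s for s in succ if labels[s] == "w"}
    formulas = {
        "E(w until g)": eu(succ, labels, "w", "g"),
        "w & EXg": white & ex(succ, gray),
    }
    return {s: frozenset(name for name, sat in formulas.items() if s in sat)
            for s in succ}


def abstract_heap(succ, labels):
    """The abstract heap is the set of all abstract cells that occur."""
    return set(abstract_cells(succ, labels).values())
```

Several concrete cells may collapse to the same abstract cell, so the abstract heap is typically much smaller than the concrete one.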




Fig. 2. Abstraction of a heap structure (F = {E(w until g), w ∧ EXg}) 



This abstraction method is similar to predicate abstraction, which has been 
well studied recently and is used for software model checking [6, 7]. In predi- 
cate abstraction, one introduces a set of predicates on states. Then, a state is 
abstracted using a set of predicates that hold at the state. Note that in ordi- 
nary predicate abstraction, the states change over time, while in our abstraction 
method, cells range over a heap structure and temporal formulas are used as 
predicates on cells. 

If we want to verify an algorithm operating on a heap structure, we should 
formulate it as a state transition system whose state is a heap. One step of 
the state transition may change the label of a cell, change the connectivity of 
a cell, delete a cell, add a new cell, etc. A state transition on a heap structure 
induces a state transition on an abstract heap. Therefore, if one can effectively 
compute state transitions on abstract heaps, one can verify properties of the state 
transition system on concrete heaps. Previously, we verified the safety property 
of concurrent garbage collection. 

The work reported here differs from our previous work in the following two 
points: 

— In our previous work, we used ordinary CTL and CTL*, so forward and 
backward connectivities were treated separately. In this work, we introduce 
2CTL (2-way computation tree logic), which allows us to treat forward and 
backward connectivities simultaneously [8]. 
— We analyze both synchronous and asynchronous transition systems in a uniform 
setting. Typical examples of synchronous transition systems composed 
of cells are cellular automata [9, 10]. By contrast, distributed systems are 
usually asynchronous. 

This paper deals only with state transitions that change the labels of 
cells. State transitions that change the connectivity of cells, or delete/add cells 
are left for future work. 

The paper is organized as follows. Section 2 explains a concrete system that 
describes the target of verification. 2-way CTL, the formal logic expressing the 
characteristics of a linked structure, is also explained in this section. Section 3 
introduces an abstract system that uses temporal logic to abstract the states of 
cells linked by pointers. Some examples are shown in Sect. 4. We show how to 
check satisfiability, which plays a central role in abstraction, in Sect. 5. Section 6 
contains our conclusions and remarks on future work. 



2 Concrete System and 2-Way Computation Tree Logic 

2.1 Concrete System 

Let $S$ be a set of cells, and $\mathcal{A}$ a non-empty finite set of labels for links. Then, we 
consider the graph $G = (S, \{R_a\})$ defined by $S$ and a set $\{R_a \subseteq S \times S \mid a \in \mathcal{A}\}$ 
of labeled links between cells. We assume that for each label $a \in \mathcal{A}$, there exists 
$\bar{a} \in \mathcal{A}$ such that 

$$R_{\bar{a}} = \{(s_1, s_2) \mid (s_2, s_1) \in R_a\},$$

and $\bar{\bar{a}} = a$. 

Let P be a set of labels for cells. We call the labeling of cells with elements 
of P a state of cells. We consider a transition system on the graph G such that 
it changes the state of cells on the graph, but does not change the structure 
defined by $G$. The system is denoted by $(S, \{R_a\}, L_0, \Delta)$, where $L_0 : S \to P$ is 
the initial state of the cells, and $\Delta$ is a finite set of rewrite rules. Each rewrite 
rule has the form $\phi \to p$, where $\phi$ is a formula in 2-way computation tree logic 
(2CTL), which will be defined later, and $p \in P$. 

There are two types of transition: synchronous and asynchronous. We say 
that a state L of the cells can make a synchronous transition to L' if the following 
condition is satisfied: 

$$\forall s \in S.\ \exists (\phi \to p) \in \Delta.\ (L'(s) = p) \wedge (s \models_L \phi)$$

where $s \models_L \phi$ denotes that $\phi$ holds at $s$ in 2CTL, which is also explained in 
detail below. Similarly, a state $L$ of cells can make an asynchronous transition 
to $L'$ if the following condition is satisfied: 

$$\exists s \in S.\ \bigl(\exists (\phi \to p) \in \Delta.\ (L'(s) = p) \wedge (s \models_L \phi)\bigr) \wedge \bigl(\forall s' \in S.\ s' \neq s \Rightarrow L'(s') = L(s')\bigr)$$
Namely, a synchronous transition represents the situation in which all the cells 
are rewritten simultaneously, and an asynchronous one represents the situation 
in which exactly one cell is rewritten. 
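
The difference between the two kinds of transition can be made concrete with a small sketch. It is an illustration under stated assumptions: the set of cells is finite, a state is a dictionary from cells to labels, and each rewrite rule is given as a pair of a Python predicate (standing in for the 2CTL condition) and a new label. The function names are hypothetical.

```python
from itertools import product


def enabled_labels(rules, state, s):
    """New labels p such that some rule (cond -> p) has its condition true at s."""
    return sorted({p for cond, p in rules if cond(state, s)})


def sync_successors(cells, rules, state):
    """All states reachable by rewriting every cell simultaneously.

    If some cell has no enabled rule, the product is empty and there is
    no synchronous successor, matching the universal quantifier above.
    """
    choices = [enabled_labels(rules, state, s) for s in cells]
    return [dict(zip(cells, combo)) for combo in product(*choices)]


def async_successors(cells, rules, state):
    """All states reachable by rewriting exactly one cell."""
    result = []
    for s in cells:
        for p in enabled_labels(rules, state, s):
            nxt = dict(state)
            nxt[s] = p  # every other cell keeps its label
            result.append(nxt)
    return result
```

Note that an asynchronous step may rewrite a cell to the label it already has, so the successor state can coincide with the current one.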

Hereinafter, we call a cell/graph/transition/system defined as above a con- 
crete cell/graph/transition/system, respectively, in contrast to their abstract 
counterparts, which are defined later. 

2.2 2-Way Computation Tree Logic (2CTL) 

In this section, we explain the 2-way computation tree logic (2CTL) that is used 
in the definition of transitions in concrete systems. 2CTL is an extension of the 
usual CTL with inverse modality. Several logics with inverse modality have been 
studied already. For example, Vardi gave a proof procedure based on automata 
theory for 2-way μ-calculus [8]. 

A (positive form) formula in 2CTL is defined as follows: 

$$\phi ::= p \mid M\phi \mid \phi \wedge \phi \mid \phi \vee \phi$$

where $p \in P$ is a label for a cell in a concrete system, and it represents an atomic 
proposition in 2CTL. $M$ is one of the following modal operators: 

$$AX_A \quad EX_A \quad AG_A \quad EG_A \quad AF_A \quad EF_A$$

where $A$ is a set of modality labels that corresponds to a set $A \subseteq \mathcal{A}$ of labels for 
links in a concrete system. As we mentioned in Sect. 2.1, there exists an inverse 
modality label $\bar{a}$ for each modality label $a \in \mathcal{A}$, and $\bar{\bar{a}}$ equals $a$. We simply write 
$AX_a$ for $AX_{\{a\}}$. Although we can add an until operator, this is excluded from 
this paper for brevity. 

The tuple $(S, \{R_a\}, L)$ made of a concrete graph and a state of concrete cells 
can be regarded as a Kripke structure. In the rest of this subsection, we define 
the semantics of 2CTL positive form formulas with this structure. 

We say that an infinite sequence $s_0, s_1, \ldots$ of cells is an $A$-path if for all $i$, 
there exists $a \in A$ such that $s_i R_a s_{i+1}$. We say that a finite sequence $s_0, s_1, \ldots, s_n$ 
is an $A$-path if for $i < n$, there exists $a \in A$ such that $s_i R_a s_{i+1}$, and there is no 
$a \in A$ and $t \in S$ satisfying $s_n R_a t$. 

For cell $s$ and formula $\phi$, the relation $s \models_L \phi$ is defined as follows: 

— $s \models_L p$ iff $p = L(s)$. 

— $s \models_L AX_A\phi$ iff for any $a \in A$ and $t \in S$ satisfying $s R_a t$, $t \models_L \phi$ holds. 

— $s \models_L EX_A\phi$ iff there exists $a \in A$ and $t \in S$ such that $s R_a t$ and $t \models_L \phi$. 

— $s \models_L AG_A\phi$ iff for any $A$-path $s = s_0, s_1, \ldots$ starting from $s$, $s_i \models_L \phi$ holds 
for all $i$. 

— $s \models_L EG_A\phi$ iff there exists an infinite $A$-path $s = s_0, s_1, \ldots$ starting from 
$s$ such that $s_i \models_L \phi$ holds for all $i$. 

— $s \models_L AF_A\phi$ iff for any infinite $A$-path $s = s_0, s_1, \ldots$ starting from $s$, there 
exists $i$ such that $s_i \models_L \phi$. 

— $s \models_L EF_A\phi$ iff there exists an $A$-path $s = s_0, s_1, \ldots$ starting from $s$ and $i$ 
such that $s_i \models_L \phi$. 

We sometimes omit $L$ from $s \models_L \phi$ when it is obvious from the context. 
If there exists a Kripke structure with state $s$ satisfying $s \models \phi$, we say that $\phi$ 
is satisfiable. Satisfiability in 2CTL plays an important role in constructing the 
abstract graphs that are defined below. 
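
Part of the semantics above can be turned into a small checker for finite concrete graphs. The following is a sketch under stated assumptions, not the procedure used in this paper: formulas are nested tuples, the inverse of a link label `a` is written `~a`, and only atoms, conjunction, disjunction, AX, EX, AG and EF are handled (EG and AF are omitted). For AG and EF it relies on the observation that every finite sequence of A-links extends to a maximal A-path, so the cells occurring on A-paths from s are exactly the cells reachable from s via A-links. All function names are hypothetical.

```python
def inverse(label):
    """Name of the inverse link label; '~r' is the inverse of 'r'."""
    return label[1:] if label.startswith("~") else "~" + label


def close_under_inverse(links):
    """Add R_{~a} = {(t, s) | (s, t) in R_a} for every given forward label a."""
    closed = {a: set(pairs) for a, pairs in links.items()}
    for a, pairs in links.items():
        closed[inverse(a)] = {(t, s) for (s, t) in pairs}
    return closed


def successors(links, A, s):
    """Cells t with s R_a t for some a in A."""
    return {t for a in A for (u, t) in links[a] if u == s}


def reachable(links, A, s):
    """Cells occurring on some A-path starting from s (s included)."""
    seen, stack = {s}, [s]
    while stack:
        u = stack.pop()
        for t in successors(links, A, u) - seen:
            seen.add(t)
            stack.append(t)
    return seen


def holds(links, labels, s, f):
    """Decide s |=_L f for the supported fragment of positive-form 2CTL."""
    kind = f[0]
    if kind == "atom":
        return labels[s] == f[1]
    if kind == "and":
        return holds(links, labels, s, f[1]) and holds(links, labels, s, f[2])
    if kind == "or":
        return holds(links, labels, s, f[1]) or holds(links, labels, s, f[2])
    _, A, g = f
    if kind == "AX":
        return all(holds(links, labels, t, g) for t in successors(links, A, s))
    if kind == "EX":
        return any(holds(links, labels, t, g) for t in successors(links, A, s))
    if kind == "AG":
        return all(holds(links, labels, t, g) for t in reachable(links, A, s))
    if kind == "EF":
        return any(holds(links, labels, t, g) for t in reachable(links, A, s))
    raise ValueError(f"unsupported operator: {kind}")
```

This checks a formula against one given concrete graph; deciding satisfiability, which asks whether any Kripke structure satisfies the formula, is a different and harder problem.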

2.3 Examples of Concrete Systems 

In this section, we give some examples of concrete systems such that $\mathcal{A} = \{\rightarrow, \leftarrow\}$ 
(where $\bar{\rightarrow} = \leftarrow$). In this case, we consider $s R_{\rightarrow} s'$ to represent the situation in 
which $s'$ is the right neighbor of $s$. We introduce some shorthand for the set of 
rewrite rules: $[q_1 \ldots q_n]p[r_1 \ldots r_m] \to p'$ represents 
$\{p \wedge EX_{\leftarrow} q \wedge EX_{\rightarrow} r \to p' \mid q \in \{q_1, \ldots, q_n\},\ r \in \{r_1, \ldots, r_m\}\}$. 
Moreover, $.p[r_1 \ldots r_m] \to p'$ is used as shorthand 
for $\{p \wedge EX_{\leftarrow} q \wedge EX_{\rightarrow} r \to p' \mid q \in P,\ r \in \{r_1, \ldots, r_m\}\}$. 
We define $[q_1 \ldots q_n]p. \to p'$ and $.p. \to p'$ similarly. When we use this shorthand, we may write $\Delta_1 \cup \Delta_2$ as $\Delta_1 \mid \Delta_2$ 
for sets $\Delta_1, \Delta_2$ of rewrite rules. 
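
The shorthand can be read as a simple expansion into sets of rules. The sketch below assumes a representation chosen here for illustration: a rule $p \wedge EX_{\leftarrow} q \wedge EX_{\rightarrow} r \to p'$ is stored as the tuple `(p, q, r, p_new)`, and a dot in the shorthand is passed as `None`. The function name `expand` is hypothetical.

```python
def expand(left, p, right, p_new, P):
    """Expand [left]p[right] -> p_new into a set of (p, q, r, p_new) tuples.

    `left` and `right` are collections of labels, or None for the dot,
    which stands for all labels in P.
    """
    qs = P if left is None else left
    rs = P if right is None else right
    return {(p, q, r, p_new) for q in qs for r in rs}
```

With this representation, the combination written with a vertical bar is ordinary set union, for example `expand([1], 0, None, 1, P) | expand(None, 0, [1], 1, P)`.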

Synchronous Transition Let $S = \mathbb{Z}$, $R_{\rightarrow} = \{(i, i+1) \mid i \in \mathbb{Z}\}$, $P = \{0, 1\}$, 
$L_0(0) = 1$, $L_0(s) = 0$ $(s \neq 0)$, and the rewrite rules be 
$\Delta = [1]0. \to 1 \mid .0[1] \to 1 \mid [0]1. \to 0 \mid .1[0] \to 0 \mid [0]0[0] \to 0 \mid [1]1[1] \to 1$. 
This system can be illustrated as follows: 

[Figure: successive states of the cells (left) and the rewrite rules for the center cell (right)] 



In this figure, the figure on the right represents the rewrite rules for the center 
cell. With these rules, a cell labeled 0 becomes 1 if at least one neighbor is 1, and 
a cell labeled 1 becomes 0 if at least one neighbor is 0. The uppermost infinite 
sequence of cells in the figure on the left represents the initial states of the cells, 
and they are changed simultaneously according to the rewrite rules. This is an 
example of a 1-dimensional cellular automaton. 
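
The synchronous example can be replayed on a finite window of the infinite line. This is a sketch under one explicit assumption: cells outside the window are treated as permanently 0, which is harmless only while the pattern of 1s stays away from the window boundary. The function names are hypothetical.

```python
def step(row):
    """One synchronous transition: every cell is rewritten simultaneously."""
    padded = [0] + row + [0]  # assumption: cells outside the window are 0
    out = []
    for i in range(1, len(padded) - 1):
        left, center, right = padded[i - 1], padded[i], padded[i + 1]
        if center == 0:
            # A 0 becomes 1 if at least one neighbor is 1.
            out.append(1 if 1 in (left, right) else 0)
        else:
            # A 1 becomes 0 if at least one neighbor is 0.
            out.append(0 if 0 in (left, right) else 1)
    return out


def run(row, steps):
    """Return the initial row followed by `steps` successive rows."""
    rows = [row]
    for _ in range(steps):
        rows.append(step(rows[-1]))
    return rows
```

Starting from a single 1, the pattern of alternating 1s and 0s widens by one cell on each side per step.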

Asynchronous Transition Let $n$ be an integer greater than 0. Moreover, 
let $S = \{0, 1, \ldots, n\}$, $R_{\rightarrow} = \{(i, i+1) \mid i = 0, \ldots, n-1\} \cup \{(n, 0)\}$, 
$P = \{T, U, E, t, u, e\}$, and the rewrite rules be 
$\Delta = .T[TUt] \to U \mid [Ttu]U. \to E \mid .E. \to T \mid [Ttu]t. \to u \mid .u[TUt] \to e \mid .e. \to t$. 

This is an example of the so-called dining philosophers. Cell 0 corresponds 
to the philosopher who takes the left fork first (left-handed), and the others 
correspond to those who take the right fork first (right-handed). States T, U, 
and E are for right-handed philosophers, and represent thinking, picking up the 
right fork, and eating, respectively. Similarly, states t, u, and e are for the left- 
handed philosopher, and represent thinking, picking up the left fork, and eating, 
respectively. 
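
For a small ring, the asynchronous system can be explored exhaustively. The sketch below is an illustration, not the abstraction-based analysis of this paper. It assumes an initial state in which every philosopher is thinking (cell 0 labeled t, all others T), which the example above does not state explicitly, and it encodes a state as a string with one character per cell. The function names are hypothetical.

```python
from collections import deque

# Neighbor labels under which the shared fork is free, read off the rules.
LEFT_FREE = set("Ttu")   # admissible labels of the left neighbor
RIGHT_FREE = set("TUt")  # admissible labels of the right neighbor


def next_label(state, i):
    """New label of cell i if one of the six rewrite rules applies, else None."""
    n = len(state)
    left, me, right = state[(i - 1) % n], state[i], state[(i + 1) % n]
    if me == "T" and right in RIGHT_FREE:
        return "U"
    if me == "U" and left in LEFT_FREE:
        return "E"
    if me == "E":
        return "T"
    if me == "t" and left in LEFT_FREE:
        return "u"
    if me == "u" and right in RIGHT_FREE:
        return "e"
    if me == "e":
        return "t"
    return None


def successors(state):
    """Asynchronous transitions: exactly one cell is rewritten."""
    result = []
    for i in range(len(state)):
        new = next_label(state, i)
        if new is not None:
            result.append(state[:i] + new + state[i + 1:])
    return result


def reachable_states(initial):
    """Breadth-first exploration of all states reachable from `initial`."""
    seen = {initial}
    queue = deque([initial])
    while queue:
        s = queue.popleft()
        for nxt in successors(s):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def adjacent_eaters(state):
    """True if two neighboring philosophers are eating at the same time."""
    n = len(state)
    return any(state[i] in "Ee" and state[(i + 1) % n] in "Ee" for i in range(n))
```

For instance, `reachable_states("tTTT")` explores the ring with n = 3; one can then check that every reachable state has a successor (no deadlock) and that `adjacent_eaters` is false throughout. Such enumeration works only for a fixed small n, whereas the abstraction developed in the following sections is aimed at reasoning independently of the number of cells.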



[Figure: successive states of the cells in the synchronous example; the top row is the initial state] 
0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 1 0 1 0 1 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 1 0 1 0 1 0 1 0 0 0 0 0 0 0 
0 0 0 0 0 0 1 0 1 0 1 0 1 0 1 0 0 0 0 0 0 

3 Abstract Systems 

3.1 Overview of Analysis Using Abstract Systems 

In this section, we consider an abstraction of the concrete systems given in 
Sect. 2.1 using abstract cells/graphs/transitions. The analysis proceeds by the 
following procedure: 

1. Compute the initial abstract graph $a(L_0)$ from the initial concrete state $L_0$ 
of the cells (Section 3.2). 

2. Iteratively compute the abstract graphs that are reachable with respect to 
abstract transitions (Section 3.3). 

3. With these operations, exhaustively collect the concrete graphs that are 
reachable in the concrete system using abstract graphs. Then, verify their 
properties on abstract graphs. 

As explained below, an abstract graph is constructed from a finite set of 
abstract cells. Therefore, the number of abstract graphs is also finite. 

We define the abstract system so that it has the soundness property with 
respect to the concrete system in the following sense: 

— If $L$ is the state of cells after $n$ steps from the initial state, then for any 
concrete cell $s$, there exists the corresponding abstract cell $a(L, s)$ in an 
abstract graph that is reachable by $n$ steps from the initial abstract state. 

— If $L$ is the state of cells after $n$ steps from the initial state, then for any 
concrete cells $s$ and $s'$ such that $s R_a s'$, there exists an abstract link labeled 
$a$ from $a(L, s)$ to $a(L, s')$ in an abstract graph that is reachable by $n$ steps 
from the initial abstract state. 

Namely, we define an abstract system so that the abstract graph can 
conservatively simulate the neighborhood in the concrete system. With this 
soundness property, if we can show that a safety modal formula, i.e., one that 
uses only $AX_a$ and $AG_a$, holds in the abstract system, then it is also 
satisfied in the concrete system. 

3.2 Abstract Cells and Abstract Graphs 

In the following, we make extensive use of the derivation in 2CTL when 
constructing an abstract system. When we derive formulas in 2CTL, we can use not 
only 2CTL theorems but also the properties of concrete cells as axioms. In the 
previous example of a cellular automaton, since the number of cells to the right 
and left of each cell is exactly one, $AX_\to\phi$ and $EX_\to\phi$ are equivalent. 
Likewise, $AX_\leftarrow\phi$ and $EX_\leftarrow\phi$ are equivalent. Hence, the negation of 
$AX_\to 0$ is equivalent to $AX_\to 1$ because the negation of 0 is 1. Likewise, 
the negation of $AX_\to AX_\to AG_\to 0$ is equivalent to $AX_\to AX_\to EF_\to 1$. 
We implicitly assume such axioms determined by the area of the problem in 
reasoning with 2CTL. 

An abstract system is defined as a transition system on abstract graphs. In 
turn, an abstract graph is defined in terms of abstract cells. Therefore, we begin 
by explaining abstract cells. 






To define abstract cells, we have to give a finite set $F \supseteq P$ of 2CTL formulas 
that includes all the elements in $P$. We can try different kinds of abstraction by 
changing the way the set $F$ is selected. 

For $C \subseteq F$, we write $\phi_C$ for 

$$\Bigl(\bigwedge_{\phi \in C} \phi\Bigr) \wedge \Bigl(\bigwedge_{\phi \in F - C} \neg\phi\Bigr).$$

We consider subset $C$ to represent a concrete cell $s$ at state $L$ of cells such that 
$s \models_L \phi_C$. Therefore, we call subset $C$ of $F$ an abstract cell if $\phi_C$ is satisfiable 
and it contains exactly one element in $P$. Otherwise, there are no corresponding 
concrete cells, so we do not regard it as an abstract cell. 

We have to choose $F$ so that the truth or falsehood of the rewriting conditions 
(i.e., $\phi$ in $(\phi \to q) \in \Delta$) in the concrete system can be determined from $\phi_C$ for 
each abstract cell $C$. 

For abstract cells $C$, $D$, and a label $a \in A$, we can place an abstract link 
labeled $a$ from $C$ to $D$ if both $\phi_C \wedge EX_a \phi_D$ and $\phi_D \wedge EX_{\bar{a}} \phi_C$ are satisfiable. 

Given a set $\mathcal{A}_R \subseteq 2^A$ of subsets of modality labels, we consider the 
reachability formulas for a set of abstract cells. We introduce an atomic proposition 
$p_C$ for each abstract cell $C$. Then, the reachability from an abstract cell $C$ to 
one of the abstract cells $D_1, \ldots, D_n$ using the modality labels in $A \in \mathcal{A}_R$ is 
expressed using the following formula: 

$$p_C \supset EF_A(p_{D_1} \vee \cdots \vee p_{D_n}).$$

This formula represents the situation in which for any concrete cell $s$ that is 
represented by the abstract cell $C$, there exists a finite $A$-path from $s$ to $t$ such 
that $t$ is represented by one of the abstract cells $D_1, \ldots, D_n$. 

An abstract graph $(\mathcal{C}, \{\mathcal{L}_a\}, \mathcal{R})$ is defined using the components mentioned 
above: 

— A set $\mathcal{C}$ of abstract cells. 

— A set $\{\mathcal{L}_a\}$ of labeled abstract links that satisfy the above condition. 

— A set $\mathcal{R}$ of reachability formulas. 

Given a concrete graph $G = (S, \{R_a\})$, a state $L$ of concrete cells, a set $F$ of 
2CTL formulas, and a set $\mathcal{A}_R$ of subsets of modality labels, we define an abstract 
graph $\alpha(L) = (\mathcal{C}, \{\mathcal{L}_a\}, \mathcal{R})$ as follows: 

— Define $\mathcal{C} = \{\alpha(L, s) \mid s \in S\}$ where $\alpha(L, s)$ is an abstract cell corresponding 
to a concrete cell $s$ defined as follows: 

$$\alpha(L, s) = \{\phi \in F \mid s \models_L \phi\}.$$

— $\{\mathcal{L}_a\}$ is defined as $\forall C_1 C_2 \in \mathcal{C}.\ C_1 \mathcal{L}_a C_2 \Leftrightarrow \exists s_1 s_2.\ s_1 R_a s_2$ and 
$C_1 = \alpha(L, s_1)$ and $C_2 = \alpha(L, s_2)$. Note that this $\{\mathcal{L}_a\}$ satisfies the condition for 
abstract links. 

— $\mathcal{R}$ is a set of reachability formulas derived from the reachability between 
corresponding concrete cells, i.e., 

$$\mathcal{R} = \{\, p_C \supset EF_A\, p_D \mid A \in \mathcal{A}_R,\ \forall s.\ C = \alpha(L, s) \Rightarrow \exists t.\ D = \alpha(L, t) \text{ and there exists a finite } A\text{-path from } s \text{ to } t \text{ in } G \,\}$$

Then, the following properties hold for arbitrary $L$: 

— For a concrete cell $s$, there exists an abstract cell $\alpha(L, s)$ in $\alpha(L)$. 

— If $s R_a s'$ holds in a concrete graph, then there is an abstract link labeled $a$ 
from $\alpha(L, s)$ to $\alpha(L, s')$ in $\alpha(L)$. 
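
The construction of the abstract cells and links can be illustrated with a small sketch. It rests on assumptions that are not part of the paper: each formula of F is modelled as a Python predicate standing in for the satisfaction relation, the formula names and the five-cell line are hypothetical, and reachability formulas are omitted.

```python
from typing import Callable, Dict, FrozenSet

State = Dict[int, str]                      # concrete cell -> its label
Formula = Callable[[State, int], bool]      # stand-in for "s |=_L phi"

# Hypothetical concrete graph: five cells in a line, linked to the right.
CELLS = range(5)
RIGHT = {(s, s + 1) for s in range(4)}
L0: State = {0: "0", 1: "0", 2: "1", 3: "0", 4: "0"}

# Hypothetical formula set F (names are labels only, not a parser input).
FORMULAS: Dict[str, Formula] = {
    "0": lambda L, s: L[s] == "0",
    "1": lambda L, s: L[s] == "1",
    "AX_right 0": lambda L, s: all(L[t] == "0" for (u, t) in RIGHT if u == s),
}


def abstract_cell(L: State, s: int) -> FrozenSet[str]:
    """alpha(L, s): the formulas of F that hold at cell s in state L."""
    return frozenset(name for name, holds in FORMULAS.items() if holds(L, s))


def abstract_graph(L: State):
    """Abstract cells and abstract links induced by the concrete graph."""
    alpha = {s: abstract_cell(L, s) for s in CELLS}
    cells = set(alpha.values())
    links = {"right": {(alpha[s], alpha[t]) for (s, t) in RIGHT}}
    return cells, links, alpha


cells, links, alpha = abstract_graph(L0)
```

In this toy instance the five concrete cells collapse to three abstract cells, and every concrete link has an abstract counterpart by construction, which mirrors the two properties stated above.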



3.3 Abstract Transitions 

An abstract transition involves the following three steps: 

1. Rewrite the abstract cells according to their formulas and the rewrite rules 
in the concrete system. 

2. Reevaluate the formulas in the abstract cells. 

3. Merge the abstract cells having the same set of formulas. 

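In the first step, a new state $q \in P$ is computed for each abstract cell $C$ using 
the rewrite rules. For an abstract graph $(\mathcal{C}, \{\mathcal{L}_a\}, \mathcal{R})$, a tuple $(\mathcal{C}^1, \{\mathcal{L}^1_a\}, \mathcal{R}^1)$ of a 
graph and a set of reachability formulas is called a result of a rewrite if the 
following condition is satisfied: 

— For a synchronous transition: 

$$\begin{aligned}
&\bigl(\mathcal{C}^1 = \{(C, q) \mid C \in \mathcal{C},\ \exists(\phi \to q) \in \Delta.\ \phi_C \supset \phi \text{ is derivable}\}\bigr) \wedge{} \\
&\bigl(\forall (C, q), (C', q') \in \mathcal{C}^1.\ (C, q)\,\mathcal{L}^1_a\,(C', q') \Leftrightarrow C\,\mathcal{L}_a\,C'\bigr) \wedge{} \\
&\bigl(\mathcal{R}^1 = \{\, p_{(C,q)} \supset EF_A(\textstyle\bigvee_i p_{(D_i,q_i)}) \mid (p_C \supset EF_A(\textstyle\bigvee_i p_{D_i})) \in \mathcal{R},\ (C, q) \in \mathcal{C}^1,\ (D_i, q_i) \in \mathcal{C}^1 \,\}\bigr)
\end{aligned}$$

— For an asynchronous transition: 

$$\begin{aligned}
&\bigl(\exists C \in \mathcal{C}.\ \exists(\phi \to q) \in \Delta.\ \phi_C \supset \phi \text{ is derivable} \wedge{} \\
&\quad \mathcal{C}^1 = \{(C, q)\} \cup \{(C', q') \mid C' \in \mathcal{C},\ \{q'\} = C' \cap P\}\bigr) \wedge{} \\
&\bigl(\forall (C, q), (C', q') \in \mathcal{C}^1.\ (C, q)\,\mathcal{L}^1_a\,(C', q') \Leftrightarrow C\,\mathcal{L}_a\,C'\bigr) \wedge{} \\
&\bigl(\mathcal{R}^1 = \{\, p_{(C,q)} \supset EF_A(\textstyle\bigvee_i p_{(D_i,q_i)}) \mid (p_C \supset EF_A(\textstyle\bigvee_i p_{D_i})) \in \mathcal{R},\ (C, q) \in \mathcal{C}^1,\ (D_i, q_i) \in \mathcal{C}^1 \,\}\bigr)
\end{aligned}$$

A pair $(C, q) \in \mathcal{C}^1$ represents the situation in which the state of a concrete 
cell that is represented by $C$ is rewritten to $q \in P$. For a synchronous transition, 
an abstract cell $C$ can represent multiple concrete cells that can be rewritten 
to different results if the rewrite rules are nondeterministic. Thus we consider 
all the possible results of the rewrite at once. 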

For a synchronous transition, the result of the rewrite $(\mathcal{C}^1, \{\mathcal{L}^1_a\}, \mathcal{R}^1)$ is 
determined uniquely if it exists. By contrast, there may be multiple results of a 
rewrite for an asynchronous transition. In such a case, the reevaluation step (the 
second step) is performed for each result of the rewrite. 
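
The difference between the two kinds of rewrite can be sketched as follows. This is an illustration under a strong simplifying assumption that is not in the paper: a rule condition is taken to be one of the formulas of F, and it is treated as derivable from an abstract cell exactly when it is a member of that cell. Links and reachability formulas are left out, and all names are hypothetical.

```python
from typing import FrozenSet, List, Set, Tuple

AbstractCell = FrozenSet[str]
Rule = Tuple[str, str]          # (condition formula, new state)


def sync_rewrite(cells: Set[AbstractCell], rules: List[Rule]):
    """One result: every cell is rewritten by every rule that applies to it."""
    return {(C, q) for C in cells for (cond, q) in rules if cond in C}


def async_rewrites(cells: Set[AbstractCell], rules: List[Rule],
                   P: FrozenSet[str]):
    """One result per (cell, rule) choice; all other cells keep their state."""
    # Each abstract cell contains exactly one element of P, its current state.
    unchanged = {(C, next(iter(C & P))) for C in cells}
    return [{(C, q)} | unchanged
            for C in cells for (cond, q) in rules if cond in C]


P = frozenset({"0", "1"})
Ca = frozenset({"0", "n1"})     # state 0, hypothetical formula "n1" holds
Cb = frozenset({"1"})           # state 1
rules = [("n1", "1"), ("1", "1")]

sync_result = sync_rewrite({Ca, Cb}, rules)
async_results = async_rewrites({Ca, Cb}, rules, P)
```

The synchronous rewrite yields a single set of pairs, whereas the asynchronous rewrite yields one candidate set per applicable cell and rule, each of which would be reevaluated separately.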

Next, we reevaluate a set of formulas in $F$ for each abstract cell with a new 
state $(C, q) \in \mathcal{C}^1$. Namely, for each $\phi \in F$, we examine whether $\phi$ or $\neg\phi$ holds 
(or we cannot determine which of them holds) at the concrete cells that $(C, q)$ 
represents. 

This determination is done by checking the satisfiability of $p_{(C,q)} \wedge \phi$ and 
that of $p_{(C,q)} \wedge \neg\phi$ under the following assumptions: 

— Axioms determined by the area of the problem. 

— $p_{(C,q)} \supset q$ for each $(C, q) \in \mathcal{C}^1$. 

— $p_{(C,q)} \supset \neg q'$ if $q \neq q'$ for each $(C, q) \in \mathcal{C}^1$. 

— $p_{(C,q)} \supset AX_a(\bigvee\{p_{(C',q')} \mid (C, q)\,\mathcal{L}^1_a\,(C', q')\})$ for each $(C, q) \in \mathcal{C}^1$. 

— Reachability formulas $\mathcal{R}^1$. 

We simply say that $p_{(C,q)} \wedge \phi$ (or $p_{(C,q)} \wedge \neg\phi$) is satisfiable in $(\mathcal{C}^1, \{\mathcal{L}^1_a\}, \mathcal{R}^1)$ 
when it is satisfiable under the above assumptions. 

In general, we cannot determine the truth/falsehood of non-universal formulas, 
such as those containing $EF_a$, using only the first four assumptions 
above, because the existence of abstract links does not assure the existence of 
the corresponding concrete links. Therefore, we include the reachability 
formulas in an abstract system so that we can use more information to determine the 
truth/falsehood of formulas in $F$. In general, we can include any kind of 
formulas among abstract cells other than reachability, provided the formulas can be 
inherited conservatively when the operations in abstract transitions, such as 
rewriting, splitting, and merging, are performed. 

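Even if we make use of reachability formulas, both $p_{(C,q)} \wedge \phi$ and 
$p_{(C,q)} \wedge \neg\phi$ may be satisfiable in $(\mathcal{C}^1, \{\mathcal{L}^1_a\}, \mathcal{R}^1)$. Then, we have to split the case into 
one in which formula $\phi$ is true and one in which it is false. This corresponds 
to splitting an abstract cell, and it is expressed as the following function $\gamma$: for 
$(C, q) \in \mathcal{C}^1$, we define 

$$\begin{aligned}
\gamma(C, q) = \{\, (C, C') \mid{} & C' \subseteq F,\ C' \cap P = \{q\} \text{ and} \\
& \phi \in C' \cap F \Rightarrow p_{(C,q)} \wedge \phi \text{ is satisfiable in } (\mathcal{C}^1, \{\mathcal{L}^1_a\}, \mathcal{R}^1) \text{ and} \\
& \phi \in F \setminus C' \Rightarrow p_{(C,q)} \wedge \neg\phi \text{ is satisfiable in } (\mathcal{C}^1, \{\mathcal{L}^1_a\}, \mathcal{R}^1) \,\}
\end{aligned}$$

The result of the reevaluation from $(\mathcal{C}^1, \{\mathcal{L}^1_a\}, \mathcal{R}^1)$ is defined as $(\mathcal{C}^2, \{\mathcal{L}^2_a\}, \mathcal{R}^2)$ 
where 

$$\mathcal{C}^2 = \bigcup \{\gamma(C, q) \mid (C, q) \in \mathcal{C}^1\}$$

$$\begin{aligned}
&\forall (C_1, C_1'), (C_2, C_2') \in \mathcal{C}^2.\ (C_1, C_1')\,\mathcal{L}^2_a\,(C_2, C_2') \Leftrightarrow \\
&\quad (\exists q_1 q_2.\ (C_1, q_1)\,\mathcal{L}^1_a\,(C_2, q_2),\ \{q_1\} = C_1' \cap P,\ \{q_2\} = C_2' \cap P) \text{ and} \\
&\quad \text{both } \phi_{C_1'} \wedge EX_a \phi_{C_2'} \text{ and } \phi_{C_2'} \wedge EX_{\bar{a}} \phi_{C_1'} \text{ are satisfiable.}
\end{aligned}$$

$$\begin{aligned}
\mathcal{R}^2 = \{\, & p_{(C,C')} \supset EF_A(\textstyle\bigvee_i \bigvee\{p_{(D,D')} \mid (D, D') \in \gamma(D_i, q_i)\}) \mid \\
& \exists q.\ (C, C') \in \gamma(C, q) \text{ and } (p_{(C,q)} \supset EF_A(\textstyle\bigvee_i p_{(D_i,q_i)})) \in \mathcal{R}^1 \,\}
\end{aligned}$$

Namely, the set $\mathcal{L}^2_a$ of links is constructed by inheriting links in $\mathcal{L}^1_a$ but 
excluding the ones that do not satisfy the condition for abstract links. 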

After the reevaluation of all the results of rewrites is finished, multiple 
abstract cells that have the same set of formulas are merged as the third step of 
an abstract transition. This merge operation enables us to bound the size of the 
abstract graph using the number of different abstract cells. 

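The result $(\mathcal{C}', \{\mathcal{L}'_a\}, \mathcal{R}')$ of a merge is defined as follows: 

$$\mathcal{C}' = \{C' \mid \exists C.\ (C, C') \in \mathcal{C}^2\}$$

$$\forall C_1', C_2' \in \mathcal{C}'.\ C_1'\,\mathcal{L}'_a\,C_2' \Leftrightarrow \exists C_1 C_2.\ (C_1, C_1')\,\mathcal{L}^2_a\,(C_2, C_2')$$

$$\begin{aligned}
\mathcal{R}' = \{\, & p_{C'} \supset EF_A(p_{D_1'} \vee \cdots \vee p_{D_n'}) \mid \\
& \forall C.\ (C, C') \in \mathcal{C}^2 \Rightarrow \exists D_1 \ldots D_n. \\
& (p_{(C,C')} \supset EF_A(p_{(D_1,D_1')} \vee \cdots \vee p_{(D_n,D_n')})) \in \mathcal{R}^2 \,\}
\end{aligned}$$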
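
The cell and link parts of the merge can be sketched in a few lines. This is an illustrative sketch only: abstract cells after reevaluation are modelled as pairs (old cell, new formula set), the cell names are hypothetical, and the reachability formulas are omitted.

```python
from typing import Dict, FrozenSet, Set, Tuple

Split = Tuple[str, FrozenSet[str]]      # (name of old cell C, new formula set C')


def merge(cells2: Set[Split], links2: Dict[str, Set[Tuple[Split, Split]]]):
    """Identify cells with the same formula set and carry the links over."""
    merged_cells = {new for (_old, new) in cells2}
    merged_links = {
        label: {(src[1], dst[1]) for (src, dst) in pairs}
        for label, pairs in links2.items()
    }
    return merged_cells, merged_links


X = frozenset({"0"})
Y = frozenset({"1"})
cells2 = {("c1", X), ("c2", X), ("c3", Y)}
links2 = {"right": {(("c1", X), ("c2", X)), (("c2", X), ("c3", Y))}}

merged_cells, merged_links = merge(cells2, links2)
```

Here the two cells that share the formula set `X` become one cell, so the link between them turns into a self-loop on `X`.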

As for an asynchronous transition, the repeated computation of abstract 
transitions constructs a tree of transitions in general, because there are multiple 
results $(\mathcal{C}', \{\mathcal{L}'_a\}, \mathcal{R}')$ of the abstract transition for a single $(\mathcal{C}, \{\mathcal{L}_a\}, \mathcal{R})$. We may 
conservatively merge these results of the abstract transition using a merge 
operation similar to the above one, in order to prevent the number of 
abstract graphs from becoming huge (although it is bounded). 



4 Examples 



4.1 Synchronous Transition: 1-Dimensional Cellular Automaton 



Abstract Graph In this section, we give an abstraction of a 1-dimensional 
cellular automaton as an example of a synchronous system. The concrete system 
was given in Sect. 2.3. 

The following figure illustrates the abstract graph of the initial concrete 
graph. The dotted lines represent the abstraction function $\alpha$. 

[Figure: the initial concrete graph and its abstract graph with abstract cells C1, C2, C3, C4, C5] 

In the above figure, graphical symbols are used as shorthands for $AX_\to$, $AX_\leftarrow$, 
$AG_\to$, $AG_\leftarrow$, $EF_\to$, and $EF_\leftarrow$. 

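In this example, we take $F$ as the collection of the following formulas. 

$0$, $AX_\to 0$, $AX_\leftarrow 0$, 

$AX_\to AX_\to AG_\to 0$, $AX_\to AX_\to AG_\to 1$, $AX_\leftarrow AX_\leftarrow AG_\leftarrow 0$, $AX_\leftarrow AX_\leftarrow AG_\leftarrow 1$ 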

As we noted earlier, there are axioms that can be derived from the area 
of the problem. For example, the negation of $AX_\to AX_\to AG_\to 0$ is equivalent to 
$AX_\to AX_\to EF_\to 1$. So in the above figure, the formula corresponding to $\phi \in F - C$ 
is shown as the formula that is equivalent to the negation. 

Abstract links labeled $\to$ that are derived from the concrete initial state are 
drawn in solid lines. Reachability formulas are $\{p_{C_i} \supset EF_\to p_{C_j} \mid i < j\}$ if we 
name abstract cells $C_1, \ldots, C_5$ from left to right. 



Abstract Transition Next, we compute the result of the rewrite as the first step 
of the abstract transition. In this example, we have either $0$ or $1$, either $AX_\to 0$ or 
$AX_\to 1$, and either $AX_\leftarrow 0$ or $AX_\leftarrow 1$ in each abstract cell, so the result of the rewrite 
is determined as follows: 

[Figure: result of the rewrite on abstract cells C1, C2, C3, C4, C5] 

Then, we reevaluate each formula in $F$. A formula containing only $AX_\to$, $AG_\to$, 
$AX_\leftarrow$, and $AG_\leftarrow$ is true if it is true when we regard the above figure as a Kripke 
structure. For example, $AX_\to AX_\to AG_\to 0$ holds at $C_3$. 

The truth/falsehood of a formula containing $EF_\to$ is sometimes determined 
from reachability. Here we consider the formula $AX_\to AX_\to EF_\to 1$ as an example. 
Note that its negation is equivalent to $\psi = AX_\to AX_\to AG_\to 0$ as we mentioned 
before. Then, $p_{C_1} \wedge \psi$ is unsatisfiable because $p_{C_1} \supset AX_\to AX_\to (p_{C_1} \vee p_{C_2} \vee p_{C_3})$ 
holds, and $(p_{C_1} \vee p_{C_2} \vee p_{C_3}) \supset EF_\to p_{C_4}$ is derivable using reachability formulas. 
That corresponds to the fact that it is not the case that the right neighbor of 
the concrete cell corresponding to $C_1$ is always abstracted to $C_1$. 

As we mentioned, the truth or falsehood of some properties in $F$ may not be 
determined by reevaluation in some cases. For example, we can derive neither $AX_\to 0$ 
nor its negation $AX_\to 1$ for $C_1$. So this cell is split into two cells, one having $AX_\to 0$ 
and the other having $AX_\to 1$. 

As a result of reevaluation and reconstruction of abstract links, we obtain 
the following abstract graph: 




If the reevaluation produces multiple abstract cells having the same 
set of formulas in $F$, they are merged. By repeatedly applying abstract transitions, 
we reach the following fixed point of the abstract graph: 

[Figure: fixed point of the abstract graph] 

4.2 Asynchronous Transition: Dining Philosophers 

For the example of dining philosophers that we mentioned in Sect. 2.3, we can 
show that no deadlock occurs if there are $n$ ($\geq 1$) right-handed philosophers and 
exactly one left-handed philosopher, independent of $n$. 

We take the set $F$ that characterizes abstract cells as $S \cup \{EX_\leftarrow p \mid p \in S\} \cup 
\{EX_\to p \mid p \in S\}$. To avoid complicated notation, we use the shorthand $[\alpha]p[\beta]$ to 
represent a set of abstract cells, where $p$ is the state of the cell itself, and $\alpha$ and $\beta$ are 
sequences that represent the sets of states that the left and right neighbor cells can take, 
respectively. For example, [Tt]T[Tt] represents four abstract cells. Reachability 
formulas are not used in this example. 
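
The shorthand can be expanded mechanically. The sketch below is illustrative and uses a hypothetical encoding: one abstract cell is written as a triple (left neighbor state, own state, right neighbor state).

```python
from itertools import product
from typing import List, Tuple


def expand_shorthand(left: str, state: str, right: str) -> List[Tuple[str, str, str]]:
    """Expand [left]state[right] into individual abstract cells.

    Each character of `left` and `right` is one possible neighbor state.
    """
    return [(l, state, r) for l, r in product(left, right)]


cells = expand_shorthand("Tt", "T", "Tt")
```

For [Tt]T[Tt] this yields the four cells with left and right neighbors drawn independently from {T, t}, in agreement with the count given above.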

With the above notation, the initial abstract cells where all philosophers are 
thinking are represented as [Tt]T[Tt], [T]t[T]. 




Starting with these initial abstract cells, the reachable ones are computed as 
follows: 



[TUEtue]T[TUEtue], [TUEtue]U[TUt], [Ttu]E[TUt], 
[TUE]t[TUE], [T]u[TUE], [T]e[TU] 

Because there is exactly one left-handed philosopher, the left-handed philosopher 
and its left and right neighbors in the concrete system must be represented by 
one of [TUE]t[TUE], [T]u[TUE], or [T]e[TU]. For each case, we show that 
there exists at least one concrete cell whose state can be changed by the 
transition. For those including e or E, it can be changed to t or T by the transition, 
respectively. For [T]u[TU], there is a transition to [T]e[TU]. 

The remaining case is [TU]t[TU]. This can be divided into [T]t[TU], [U]t[T], 
and [U]t[U]. For the first one, there is a transition [T]u[TU]. The last one can 
be changed to either [U]t[E] or [E]t[E] (the latter case occurs when there is only 
one right-handed philosopher). 

Then the remaining case is now [U]t[T]. Here, the state of the second right 
neighbor of the left-handed philosopher must not be t, u, or e. The latter two 
cases contradict the fact that there is only one left-handed philosopher, 
and the first one also leads to a contradiction because the left neighbor of the 
left-handed philosopher is now picking up the right fork. Therefore, the possible 
state of the second right neighbor is either T, U, or E. For T and U, there is a 
transition from [U]t[T][TU] to [U]t[U][TU]. Finally, for E, there is a transition 
from [U]t[T][E] to [U]t[T][T]. 

5 Checking Satisfiability 

For the completeness of the paper, we briefly describe an algorithm for checking 
satisfiability of 2CTL in this section. This is a straightforward extension of the 
standard tableau method for CTL [11]. 

For a formula $\phi_0$, we define a set $cl(\phi_0)$ of formulas as the smallest set 
satisfying the following conditions: 

— $\phi_0 \in cl(\phi_0)$. 

— If $\phi_1 \wedge \phi_2 \in cl(\phi_0)$, then $\phi_1 \in cl(\phi_0)$ and $\phi_2 \in cl(\phi_0)$. 

— If $\phi_1 \vee \phi_2 \in cl(\phi_0)$, then $\phi_1 \in cl(\phi_0)$ and $\phi_2 \in cl(\phi_0)$. 

— If $M\phi \in cl(\phi_0)$, then $\phi \in cl(\phi_0)$. 

— If $\phi \in cl(\phi_0)$ and $expand(\phi)$ is defined, then $expand(\phi) \in cl(\phi_0)$, where 
$expand(\phi)$ is defined as follows: 

$expand(AG_A\phi) = \phi \wedge AX_A AG_A\phi$, $expand(EG_A\phi) = \phi \wedge EX_A EG_A\phi$, 
$expand(AF_A\phi) = \phi \vee AX_A AF_A\phi$, $expand(EF_A\phi) = \phi \vee EX_A EF_A\phi$. 
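
The closure can be computed with a simple worklist. The sketch below assumes a hypothetical tuple encoding of formulas that is not part of the paper: `("atom", p)`, `("and", f, g)`, `("or", f, g)`, and `(M, A, f)` for a modality `M` indexed by a frozenset `A` of labels.

```python
MODALITIES = {"AX", "EX", "AG", "EG", "AF", "EF"}


def expand(f):
    """Fixed-point unfolding of AG, EG, AF, EF; None when undefined."""
    op = f[0]
    if op == "AG":
        return ("and", f[2], ("AX", f[1], f))
    if op == "EG":
        return ("and", f[2], ("EX", f[1], f))
    if op == "AF":
        return ("or", f[2], ("AX", f[1], f))
    if op == "EF":
        return ("or", f[2], ("EX", f[1], f))
    return None


def closure(phi0):
    """Smallest set containing phi0 and closed under the five conditions."""
    cl, todo = set(), [phi0]
    while todo:
        f = todo.pop()
        if f in cl:
            continue
        cl.add(f)
        if f[0] in ("and", "or"):
            todo.extend([f[1], f[2]])       # both immediate subformulas
        elif f[0] in MODALITIES:
            todo.append(f[2])               # body of the modality
        unfolded = expand(f)
        if unfolded is not None:
            todo.append(unfolded)
    return cl


A = frozenset({"right"})
phi0 = ("EF", A, ("atom", "1"))
cl = closure(phi0)
```

For this `phi0` the closure contains the formula itself, its body, its unfolding, and the `EX` formula inside the unfolding; the unfolding refers back to `phi0`, so the computation terminates.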

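Suppose $\Gamma \subseteq cl(\phi_0)$. If $\phi_1 \vee \phi_2 \in \Gamma$ implies $f(\phi_1 \vee \phi_2) \in \{1, 2\}$, and $EX_A\phi \in \Gamma$ 
implies $f(EX_A\phi) \in A$, then we call $f$ a choice function on $\Gamma$. When the following 
conditions are satisfied, a pair $(\Gamma, f)$ of $\Gamma$ and a choice function $f$ on $\Gamma$ is called 
a $\phi_0$-type. 

— $p \in \Gamma$ and $\neg p \in \Gamma$ do not hold simultaneously. 

— If $\phi_1 \wedge \phi_2 \in \Gamma$, then $\phi_1 \in \Gamma$ and $\phi_2 \in \Gamma$. 

— If $\phi_1 \vee \phi_2 \in \Gamma$, then $\phi_{f(\phi_1 \vee \phi_2)} \in \Gamma$. 

— If $\phi \in \Gamma$ and $expand(\phi)$ is defined, then $expand(\phi) \in \Gamma$. 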

For each modality label $a$, a transition relation $(\Gamma, f) \xrightarrow{a} (\Gamma', f')$ between 
$\phi_0$-types is defined as follows: 

— There exists $EX_A\phi \in \Gamma$ such that $f(EX_A\phi) = a$ and $\phi \in \Gamma'$. 

— For all $AX_A\psi \in \Gamma$, $\psi \in \Gamma'$ if $a \in A$. 

— For all $AX_A\psi \in \Gamma'$, $\psi \in \Gamma$ if $\bar{a} \in A$. 

— There is no $AF_A\psi$ satisfying the following conditions: 

$a \in A$ and $\bar{a} \in A$, $AF_A\psi \in \Gamma$ and $AF_A\psi \in \Gamma'$, 

$f(\psi \vee AX_A AF_A\psi) = 2$, $f'(\psi \vee AX_A AF_A\psi) = 2$. 

The last condition above characterizes the fact that 2-way CTL has the inverse 
modality. 

The procedure below determines the satisfiability of the given formula $\phi_0$ by 
repeatedly removing inconsistent $\phi_0$-types. 

When $EX_A\phi \in \Gamma$ holds, if all $\phi_0$-types $(\Gamma', f')$ satisfying $(\Gamma, f) \xrightarrow{a} (\Gamma', f')$ 
have been removed, then remove $(\Gamma, f)$. 

Let $\psi_0 = AF_A\psi$ or $\psi_0 = EF_A\psi$. Suppose $(\Gamma_0, f_0)$ is a $\phi_0$-type, and $\psi_0 \in \Gamma_0$. 
We consider a labeled finite tree $T_0$ where the label of each node is a $\phi_0$-type. 

— The root node has $(\Gamma_0, f_0)$ as its label. 

— For any node and its child, there holds a transition relation $\xrightarrow{a}$ between labels 
of two nodes for some $a \in A$. 

— For any node in $T_0$, its label $(\Gamma, f)$ satisfies $\psi_0 \in \Gamma$. 

In the case that $\psi_0 = AF_A\psi$, the following conditions are added: 

— For any internal node in $T_0$ and its label $(\Gamma, f)$, for any $EX_B\phi \in \Gamma$ 
satisfying $f(EX_B\phi) \in A$, there exists its child whose label is $(\Gamma', f')$ such that 
$(\Gamma, f) \to (\Gamma', f')$ and $\phi \in \Gamma'$ hold. 

— For any internal node in $T_0$ and its label $(\Gamma, f)$, $f(\psi \vee AX_A\psi_0) = 2$ holds. 

— For any external node in $T_0$ and its label $(\Gamma, f)$, either $f(\psi \vee AX_A\psi_0) = 1$ 
holds or there exists no $EX_B\phi \in \Gamma$ such that $f(EX_B\phi) \in A$. 

In the above conditions, an internal node in $T_0$ is a node that has at least one 
child, and an external node is a node that is not an internal one. 

In the case that $\psi_0 = EF_A\psi$, the following conditions are added: 

— $T_0$ is a path from its root to the unique external node and satisfies the 
following conditions. 

— For any internal node on the path and its label $(\Gamma, f)$, $f(\psi \vee EX_A\psi_0) = 2$ 
holds, and for the next node on the path and its label $(\Gamma', f')$, $(\Gamma, f) \to (\Gamma', f')$ 
and $\psi_0 \in \Gamma'$ hold. 

— For any external node on the path and its label $(\Gamma, f)$, $f(\psi \vee EX_A\psi_0) = 1$ 
holds. 

If there exists no tree $T_0$ satisfying the above conditions, we remove $(\Gamma_0, f_0)$. 
We repeatedly remove $\phi_0$-types as much as possible. If a $\phi_0$-type $(\Gamma, f)$ 
satisfying $\phi_0 \in \Gamma$ remains at the stage where we cannot remove any more, then $\phi_0$ is 
satisfiable. 

6 Conclusion and Future Work 

We described some results of our analysis method using abstraction by 2-way 
CTL for a linked structure in which labels of cells are changed by synchronous 
or asynchronous transitions. Many directions can be considered for future work: 

— Implementation of the satisfiability checking algorithm. 

— Extension to more expressive logics, such as CTL*, $\mu$-calculus, and guarded 
first-order logic [12]. 

— Extension to the logic that reflects the properties on concrete cells. 

— Abstraction of “hybrid cellular automata”, i.e., systems having not only 
discrete labels but also continuous (e.g., real-time) parameters in a state 
of a cell. For this purpose, we need extension to logics with time such as 
TCTL [13]. 

Twelf and Delphin: 
Logic and Functional Programming in a 
Meta-logical Framework 

Carsten Schürmann 

Logical framework research studies the design of meta-languages that support 
efficient encodings of object-languages and deductions prevalent in the areas of 
programming languages, mobile code and logics design. A meta-logical frame- 
work is a logical framework that, in addition, supports reasoning about these 
encodings. Over the previous decades various meta-logical frameworks have been 
developed and employed, among them Nuprl, Isabelle, Coq, Lego, Maude, Elan 
and, of course, the meta-logical framework Twelf [PS99] that is being discussed 
here. [Pfe99] provides an excellent overview and a historical perspective of the 
field of logical frameworks. The fields of programming language design, safe and 
secure mobile code, verification of authorization and other security protocols and 
lately even experiment design in biology drive the development of the aforemen- 
tioned frameworks and greatly benefit from their progress. 

Twelf's design is based on the dependently typed Edinburgh logical framework 
LF [HHP93] and directly supports the judgments-as-types paradigm for 
encoding object languages and their semantics. Encodings may be higher-order 
and can therefore be used to capture binding constructs of languages as well as 
the meaning of hypothetical judgments without the standard restrictions usually 
associated with inductive datatypes (i.e. the positivity condition [PM93]). 
Reasoning about LF encodings takes place outside the logical framework, i.e. in 
a special purpose meta-logic designed for LF. In this respect, the design 
philosophy underlying Twelf is complementary to that of $FO\lambda^{\Delta\mathbb{N}}$ [MM97], a fixed 
meta-logic that can be parameterized by different logical frameworks. Besides 
the meta-theorem prover that searches for proofs within that meta-logic [Sch00], 
Twelf is equipped with a logic programming language called Elf that works 
directly with higher-order encodings. 

Recently, my students and I have developed a functional programming lan- 
guage called Delphin [SP03b], that admits higher-order LF objects as first-class 
citizens and thus provides an alternative to the usual interpretation of datatypes. 
Delphin’s most prominent features include function definition by cases; pattern 
matching against arbitrary LF objects (including those of functional type); gen- 
eral recursion; runtime extensible datatypes; a dependent type system and a 
novel world system that guarantees sound usage of dynamic constructors, ter- 
mination and a coverage checker [SP03a] that checks for non-exhaustive sets 
of patterns. The projected goals of the Delphin project include access to logi- 
cal framework technology for functional programmers, richer type systems that 
capture more refined invariants and support for shorter and more concise code. 
Delphin can be used to write type preserving compilers, proof-carrying architectures, 
proof emitting theorem provers and even proof transformations in between 
different logics. 

Twelf and Delphin are closely related because they share the logical frame- 
work LF as a common data structure and they provide two different yet closely 
related operational semantics, one for logic and the other for functional program- 
ming. Driven by practical considerations, it has become important to be able to 
translate logic programs into functional programs in order to study their meta- 
theoretic properties. For example, in Twelf, logic programs play the role of proofs 
if and only if it can be shown that they are total, i.e. terminating and cover all 
cases. Coverage, however, is notoriously difficult to define and decide for Elf logic 
programs, but relatively straightforward for Delphin programs. Challenges that 
are addressed by this translation include a characterization of the interestingly 
large subset of logic programs as domain, the treatment of existential variables 
and the removal of deep backtracking. 

In my talk, I will focus on Twelf and Delphin and the relation between the 
two. 

Abstract. Justification is the process of constructing evidence, in terms 
of proof, for the truth or falsity of an answer derived by tabled evalu- 
ation. The evidence is most easily constructed by post-processing the 
memo tables created during query evaluation. In this paper we intro- 
duce online justification, based on program transformation, to efficiently 
construct the evidence during query evaluation, while adding little over- 
head to the evaluation itself. Apart from its efficiency, online justification 
separates evidence generation from exploration thereby providing flexi- 
bility in exploring the evidence either declaratively or procedurally. We 
present experimental results obtained on examples that construct large 
evidences which demonstrate the scalability of online justification. 



1 Introduction 

Explaining or understanding the results of logic program evaluation, to aid in 
debugging or analyzing the program, has been a very active field of research. The 
complex machinery used to evaluate tabled logic programs makes the generation 
of explanations for query evaluation considerably difficult. Justification [17, 9] 
is the process of generating evidence, in terms of a high level proof, based on 
the answer tables created during query evaluation. Justification has two impor- 
tant advantages over techniques which generate evidence based on execution 
traces, viz. (i) it is independent of the evaluation machinery (e.g. the techniques 
for scheduling goals in a tabled evaluation), and (ii) it enables direct construction of 
“meta-evidences”: evidences for proof systems implemented as logic programs. 
Justification has played a fundamental role in generating proofs or counter ex- 
amples for several problems in automatic verification (e.g., see [15, 16, 1]). 

In earlier works [17, 9], we presented justification algorithms for logic pro- 
grams by post-processing the memo tables created during query evaluation. Jus- 
tification in this post-processing fashion is “non-intrusive” in the sense that it 

* This research was supported in part by NSF grants CCR-9876242, IIS-0072927, 
CCR-0205376, CCR-0311512, and ONR grant N000140110967. 
is completely decoupled from the query evaluation process and is done only after 
the evaluation is completed. However, post-processing introduces performance 
overheads which affect the scalability of the technique. In this paper we present 
an online technique for justification, which generates evidence, whenever possi- 
ble, during query evaluation itself. Online justification is presented as a program 
transformation, and hence is still independent of the query evaluation machinery. 

At a high level, our online justification technique transforms the given pro- 
gram in such a way that the query evaluation in the transformed program auto- 
matically constructs the evidence. For each literal in the program, we create two 
transformed literals: one to generate the evidence for its truth, and the other 
for its falsity. Evidence for derivability (truth) of an answer can be easily con- 
structed by adding an extra argument in the predicate definitions that captures 
the evidence. However, in the presence of tabled evaluation, the extra evidence 
argument causes serious performance problems since the number of proofs for 
a goal may far exceed the number of steps it takes to compute the goal. For 
instance, consider the problem of a parser for an ambiguous grammar [21]: de- 
termining whether there is a parse tree can be done in time cubic on the length 
of the string whereas there may be exponentially many parse trees. We use a 
different transformation scheme for tabled predicates, storing the first evidence 
for an answer in a database, and thereby avoid this blow-up (see Section 3). 

Generating evidences for false literals is more difficult, since evaluation of 
a false literal simply fails without providing any information regarding the failure. 
We generate evidences for false literals by first constructing a dual program 
(based on the notion of completed definition [11]) such that a literal in the dual 
program is true if and only if its corresponding literal in the original program 
is false. Thus, evidences for false literals are constructed using the evidences for 
the corresponding (true) literals in the dual program (see Section 4). 

Related Work: Extensive research has been done to help debug, analyze, explain 
or understand logic programs. The techniques that have been developed can be 
partitioned into three (sometimes overlapping) stages: instrumenting and exe- 
cuting the program, collecting data, and analyzing the collected data. Most of 
the techniques focus primarily on one of these three stages. For instance, the 
works on algorithmic debugging [18], declarative debugging [10, 13], and asser- 
tion based debugging [14] can be seen as primarily focussing on instrumenting 
the program; works on explanation techniques (e.g.,[12, 4, 19]) focus primarily 
on data collection; and works on visualization (e.g. [4, 2]) focus on the data 
analysis stage. Justification focusses on the data collection stage. 

Unlike algorithmic debugging, justification only shows those parts of the 
computation which led to the success/failure of the query, and unlike declara- 
tive debugging, justification does not demand any creative input from the user 
regarding the intended model of the program, which can be very hard or even 
impossible to do as will be the case in model checking [3]. Among explanation 
techniques, [19] proposes a source-to-source transformation technique, which is 
very similar to our technique, to transform logic programs in the context of de- 
ductive databases. This technique generates evidence in bottom-up evaluation 
and is limited to non-tabled programs, making it expensive especially in the 
presence of redundant computations. A later work [12] generates explanations by 
meta-interpreting intermediate information (traces) computed by the database 
engine, and overcomes the problems due to redundancy by caching the results. 
However, the explicit cycle checking done when generating explanations imposes 
a quadratic time overhead for evidence generation. 

Trace-based debuggers (of which Prolog's 4-port debugger is a primitive 
example) provide only a procedural view of query evaluation. A procedural view 
provides information about the proof search, rather than the proof itself. The 
complex evaluation techniques used in tabling make this procedural view virtu- 
ally unusable. Moreover, in the presence of multiple evaluation strategies and 
engines (SLG-WAM [20], Local vs. Batched scheduling [7], DRA [8] etc.), there 
is no uniform method to convert a trace of events during a proof search into 
a view of the evidence (proof, or its lack thereof) itself. Finally, by delinking 
evidence generation from evidence navigation, justification enables a user to 
selectively explore parts of the evidence and even re-inspect an explored part 
without restarting the debugging process. 



Online Justification vs. Post-Processing: We originally presented a technique 
for justification of tabled logic programs [17] and later refined the technique to 
efficiently handle programs that mix the evaluation of tabled and nontabled 
goals [9]. However, both techniques post-processed the memo tables to build 
evidences. To understand the two main drawbacks of these techniques, consider 
the evaluation of query p over the tabled logic program given below: 

:- table p/0. 

p :- p.    p :- q.    q. 

Post-processing-based justification is done by meta-interpreting the clauses of 
the program and the memo tables built during evaluation. For instance, the 
evidence for the truth of p is constructed by selecting a clause defining p such 
that the definition is true. Note that, in this case, the right hand sides of both 
clauses of p are true. For p to be true in the least model, its truth cannot be 
derived from itself. Hence the first clause is not an explanation for the truth 
of p. The meta-interpreter keeps a history of goals visited as it searches for an 
explanation and rejects any goal that will lead to a cyclic explanation. It will 
hence reject the first clause. Further, the justifier will produce the explanation 
that p is true due to q, which in turn is true since it is a fact. 
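
The history-based search described above can be sketched in a few lines. This is an illustrative Python rendering, not the authors' XSB meta-interpreter: the names PROGRAM, TRUE_ATOMS and justify are assumptions made for this sketch, and it covers only the propositional example from the text.

```python
# Propositional program from the text:  p :- p.  p :- q.  q.
# Each atom maps to a list of clause bodies (hypothetical encoding).
PROGRAM = {"p": [["p"], ["q"]], "q": [[]]}

# Stand-in for the memo tables after evaluation: the atoms known to be true.
TRUE_ATOMS = {"p", "q"}


def justify(goal, history=()):
    """Return an explanation tree (goal, [subtrees]) or None.

    `history` holds the goals already visited on the current path; a goal
    that reappears would yield a cyclic explanation and is rejected.
    """
    if goal in history:  # linear scan of the history: the cycle check
        return None
    for body in PROGRAM.get(goal, []):
        # Only clauses whose right-hand side is true are candidates.
        if not all(atom in TRUE_ATOMS for atom in body):
            continue
        subtrees = []
        for atom in body:
            sub = justify(atom, history + (goal,))
            if sub is None:
                break  # this clause leads to a cyclic explanation
            subtrees.append(sub)
        else:
            return (goal, subtrees)
    return None
```

On this program, justify("p") skips the first clause because p is already in the history, and explains p through q. The membership test on the history is the step whose cost grows with the depth of the explanation.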

First of all, note that justification appears to perform the same kind of search 
that the initial query evaluation did in the first place to determine the answers. 
Worse still, meta-interpretation is considerably slower than the original query 
evaluation. Secondly, determining whether a goal has been seen before is exactly 
what a tabling engine does well, and one which is tricky to do efficiently in a 
meta-interpreter. The justifiers we had built earlier keep the history of goals 
as a list (to make backtracking over the history inexpensive), and hence cycle- 
checking makes the justifier take time quadratic in the size of the evidence. 

In contrast, online justification generates evidence by evaluating a
transformed program directly, exploiting the tabling engine’s (optimized) techniques
for cycle detection, and eliminating the meta-interpretation overheads.

Justification in Practice: We present the justification technique initially for pure 
logic programs with a mixture of tabled and nontabled predicates and stratified 
negation. The technique can be readily extended to handle programs with builtin
or nonlogical predicates (e.g. var) and aggregation (e.g. findall). Programs
involving database updates (e.g. assert) and tabling are very uncommon;
nevertheless our technique can be extended to such programs as well (see Section 5).

We have implemented an online justifier in the XSB logic programming sys- 
tem [22] and used it to build evidences in a complex application: the XMC model 
checking environment [15]. We use the XMC system as the primary example, 
since (i) evidences generated for XMC are large and have different structures 
based on the system and property being verified, thereby forming a platform 
to easily validate the scalability and understand the characteristics of the evi- 
dence generation technique; and (ii) counter-example generation was added to 
XMC using justification without modifying XMC itself, thereby demonstrating 
the flexibility offered by justification. Preliminary performance evaluation indi- 
cates that the online justifier for the XMC system adds very little overhead to 
the XMC model checker (see Section 6). When the model checker deems that a 
property is true, the overhead for justification to collect that proof is less than 
8%. When a property is false, the overhead of generating the evidence for the
absence of a proof is at most 50%. In contrast, the post-processing-based justifier
originally implemented in the XMC system had an overhead of 4 to 10 times the 
original evaluation time [9], regardless of the result of the model checking run. 

2 Evidence for Tabled Logic Programs 

In this section, we give the intuition and formal definition of evidence in the 
context of tabled logic programs with left-to-right selection rule. By evidence, 
we mean the data in a proof, i.e., the subgoals and derivation relation between 
the subgoals. Proofs for (non-tabled) logic programs are traditionally represented 
by trees, or so-called “proof trees”, where a goal is recursively decomposed to 
a set of subgoals until the subgoal indicates the presence or absence of a fact. 
In the case of tabled logic programs, however, a proof is not necessarily a tree 
because of the fixed-point semantics of tabling, i.e., a tabled predicate may fail 
due to circular reasoning. In [17, 9], proof trees are augmented by “ancestor” 
nodes or edges to form justifications for tabled programs, essentially indicating 
that the derivation relation of the subgoals is potentially a graph with loops. 

Formally, we define an evidence (for a tabled logic program) as a graph
whose vertices are literals and their truth values, and whose edges reflect the
derivation relation between the subgoals represented by the literals. We use
$succ(v)$ to denote the set of successors of a vertex $v$ in a graph.

Definition 1 (Evidence). An evidence for a literal L being true (false) with
respect to a program P, denoted by $\mathcal{E}_P(L)$, is a directed graph (V, E) such that:

1. Each vertex $v \in V$ is uniquely labeled by a literal of P and a truth value,
denoted by $(l(v), t(v))$;

2. There exists a vertex $v_0 \in V$ such that $l(v_0) = L$, $t(v_0) = true$ ($false$), and
all other vertices in V are reachable from $v_0$;

3. For each vertex $v \in V$:

(a) if $t(v) = true$, $succ(v) = \{v'_1, \ldots, v'_n\}$ if and only if

i. $\exists C \equiv (\alpha \texttt{ :- } \beta_1, \ldots, \beta_n) \in P$ and a substitution $\theta$ :
$\alpha\theta = l(v) \wedge (\beta_1, \ldots, \beta_n)\theta = (l(v'_1), \ldots, l(v'_n))$
ii. $\forall 1 \le i \le n : t(v'_i) = true$
iii. $(v, v) \notin E^{+}$

(b) if $t(v) = false$, $succ(v)$ is the smallest set such that

$\forall C \equiv (\alpha \texttt{ :- } \beta_1, \ldots, \beta_n) \in P$ :
$\forall$ substitution $\theta$ : $\alpha\theta = l(v)\theta \Rightarrow$
$\exists 1 \le k \le n$ and $\{v'_1, \ldots, v'_k\} \subseteq succ(v)$ :
$(\beta_1, \ldots, \beta_k)\theta = (l(v'_1), \ldots, l(v'_k))\theta$
$\wedge\ (\forall 1 \le i < k : t(v'_i) = true)$
$\wedge\ t(v'_k) = false$

Intuitively an evidence carries only the relevant information to establish a
literal’s truth or falsity. It is an AND-graph where each node is supported by
all its successors. For a true literal, only one explanation matching a predicate
definition is needed (Item 3(a)i); for a false literal, it must be shown that
every possible combination of its predicate definition ultimately fails (Item 3(b)).
Furthermore only false literals can be in a loop, due to the least fixed-point
semantics obeyed by tabled logic programs (Item 3(a)iii).

Definition 1 is logically equivalent to the definitions of justification in [17, 9],
which define a spanning tree of evidence, where the backward edges are labeled
by “ancestor” and leaves with truth value true and false are labeled by an edge
to node “fact” and “fail” respectively. The benefit of the new definition is that
the same result will be generated from different traversal orders. Applying the result
of [17, 9], we establish the usefulness of evidence by the following theorem.

Theorem 1 (Soundness and Completeness). The query of a literal L succeeds
(fails) if and only if there is an evidence for L being true (false).

Hereafter when P is obvious from the context, we abbreviate $\mathcal{E}_P(L)$ to $\mathcal{E}(L)$.
Sometimes we also abbreviate v's label $(l(v), t(v))$ to $l(v)$ or $\neg l(v)$ depending on
whether $t(v)$ is true or false.

Example 1. Consider the following three programs: 

P1: p.        P2: p :- q.        P3: :- table p/0.
                  q.                 p :- q.
                                     q :- p.

The evidence for p being true in P1 is just a single node labeled by p. The
node has no successor because it is a fact and hence does not need further explanation.
The evidence for p being true in P2 contains two nodes labeled by p and q
respectively and an edge from p to q, meaning that p is true because q is true.
Program P3 encodes a circular reasoning, therefore p is false. The evidence of p
being false is a loop from p to q then back to p. Formally,

$\mathcal{E}_{P_1}(p) = (\{(p, true)\}, \emptyset)$

$\mathcal{E}_{P_2}(p) = (\{(p, true), (q, true)\}, \{(p, q)\})$

$\mathcal{E}_{P_3}(p) = (\{(p, false), (q, false)\}, \{(\neg p, \neg q), (\neg q, \neg p)\})$
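
As a small illustration, the three evidences of Example 1 can be written down as plain data and checked against the loop condition of Definition 1 (true vertices must not lie on a cycle). The encoding and the helper names on_cycle and true_vertices_acyclic are assumptions of this Python sketch, not part of the paper's implementation.

```python
# Evidence graphs of Example 1 as (vertices, edges).
# A vertex is a (literal, truth value) pair; an edge links two literals.
E_P1 = ({("p", True)}, set())
E_P2 = ({("p", True), ("q", True)}, {("p", "q")})
E_P3 = ({("p", False), ("q", False)}, {("p", "q"), ("q", "p")})


def on_cycle(edges, v):
    """Return True if v can reach itself through one or more edges."""
    stack = [w for (u, w) in edges if u == v]
    seen = set()
    while stack:
        w = stack.pop()
        if w == v:
            return True
        if w not in seen:
            seen.add(w)
            stack.extend(x for (u, x) in edges if u == w)
    return False


def true_vertices_acyclic(evidence):
    """Check that no vertex labeled true lies on a loop of the graph."""
    vertices, edges = evidence
    return not any(truth and on_cycle(edges, lit) for (lit, truth) in vertices)
```

All three graphs pass the check: the only loop occurs in the evidence for P3, and both of its vertices are false.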

Example 2. In the following logic program, the predicate reach(A, B) defines
the reachability from node A to node B, and arc(From, To) encodes the edge
relation of a graph.

:- table reach/2.                        arc(a,b).
reach(A,B) :- arc(A,B).                  arc(b,a).
reach(A,B) :- arc(A,C), reach(C,B).      arc(c,a).

The evidences for reach(a, a) being true and reach(a, c) being false are depicted
in Figure 1.




(a) reach(a,a) being true          (b) reach(a,c) being false

Fig. 1. Two evidence examples in reachability program



3 Evidence Generation for True Literals 

The key idea of online evidence generation is to generate the evidence for a literal
while the literal is being evaluated. A simple way to implement this idea is to
extend each clause $\alpha$ :- $\beta_1, \ldots, \beta_m$ to

$\alpha'$ :- $\beta'_1, \ldots, \beta'_m$, store_evid($\alpha$, [$\beta_1, \ldots, \beta_m$]).     (1)

where $\alpha'$ and $\beta'_i$ are the same as $\alpha$ and $\beta_i$, respectively, but indicate transformed
predicates. When the query to a literal $L = \alpha\theta$ succeeds, the successors of L in
the evidence, [$\beta_1\theta, \ldots, \beta_m\theta$], are recorded by store_evid. Note that store_evid
simply records a successful application of a clause and hence does not change
the meaning of the program.

Unfortunately, this simple technique may generate more information than
necessary for the purpose of evidence. Recall that an evidence carries only the
relevant information to establish a literal’s truth or falsity, therefore a
backtracked call should not be part of the explanation for the final true answer. But
the above transformation stores evidence for calls to $\alpha$ that are backtracked in
a higher-level call.

Evidence as Explicit Arguments. We solve the above problem by passing the
evidence as an argument of the predicates. We add an argument E to each
atom $\alpha$ to form a new atom $\alpha'_E$ which returns the evidence for $\alpha$ in E. Suppose
$\alpha = p(t_1, \ldots, t_n)$, then $\alpha'_E = p'(t_1, \ldots, t_n, E)$, where $p'$ is a new name uniquely
selected for p. The clause

$p(t_1, \ldots, t_n)$ :- $\beta_1, \ldots, \beta_m$.

is then transformed to

$p'(t_1, \ldots, t_n, E)$ :- $\beta'_{1E_1}, \ldots, \beta'_{mE_m}$, $E = [(\beta_1, E_1), \ldots, (\beta_m, E_m)]$.     (2)

where $E, E_1, \ldots, E_m$ are distinct new variables. The last statement in the clause
combines the evidence for the subgoals to form the evidence for the top-level
query, thus effectively building a tree. Similar to the first transformation, the
new predicate $p'/(n+1)$ executes the same trace as the original predicate p/n.
Since the evidence is now returned as an argument, backtracked predicates no
longer generate unnecessary information.
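
The effect of returning evidence as a value, rather than recording it as a side effect, can be illustrated with a toy evaluator. This Python sketch assumes a small ground, non-recursive program; PROGRAM, solve and solve_body are hypothetical names, and the evidence term (goal, [sub-evidences]) is a simplification of the list format used in Clause (2).

```python
# A ground program (hypothetical):  a :- b, c.   a :- d.   b.   d.
# The atom c has no clauses, so the first clause of a fails after b succeeded.
PROGRAM = {
    "a": [["b", "c"], ["d"]],
    "b": [[]],
    "c": [],
    "d": [[]],
}


def solve(goal):
    """Yield one evidence term (goal, [sub-evidences]) per successful derivation."""
    for body in PROGRAM.get(goal, []):
        yield from solve_body(goal, body, [])


def solve_body(goal, body, collected):
    """Solve the remaining body literals left to right, threading the evidence."""
    if not body:
        yield (goal, collected)
        return
    first, rest = body[0], body[1:]
    for sub in solve(first):
        # Evidence lives only in this call frame; if `rest` fails, it is dropped.
        yield from solve_body(goal, rest, collected + [sub])
```

The first answer for a is built from d alone. The evidence collected for b while trying the first clause disappears on backtracking, which is the behavior the explicit-argument transformation is designed to obtain.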

Evidence for Tabled Predicates as Implicit Arguments. The situation becomes 
complicated when tabled predicates are present. Because all successful calls to 
tabled predicates are stored in tables, the space is not reclaimed when the pred- 
icates are backtracked. Furthermore, additional arguments for tabled predicates 
can increase the table space explosively [21]. Therefore there is no saving in 
passing evidence as an argument in tabled predicates. 

To avoid the overhead, we do not pass evidence as arguments in tabled
predicates. Instead, we use the store_evid method in Clause (1) to store a segment
of evidence in the database, where the evidence for a tabled subgoal $\beta_i$ is recorded
as ref($\beta_i$) pointing to $\beta_i$'s record in the database, and the evidence for a non-tabled
subgoal is the same as in Clause (2).

The complete transformation rule for true literals is shown in Figure 2. 

Note that store_evid always succeeds, therefore adding it to the original 
predicate does not change the program’s semantics. And by induction on the 
size of evidence, we have: 

Proposition 1. Let E be a fresh variable. For any literal $L \in P$,

— the query $L'_E$ succeeds if and only if L succeeds

— if the query $L'_E$ succeeds, then E returns an evidence for L being true.

For each clause $p(t_1, \ldots, t_n)$ :- $\beta_1, \ldots, \beta_m$.

— If p/n is non-tabled, transform the clause to

$p'(t_1, \ldots, t_n, E)$ :- $\beta'_{1E_1}, \ldots, \beta'_{mE_m}$,
        $E = [(\beta_1, E_1), \ldots, (\beta_m, E_m)]$.

— If p/n is tabled, transform the clause to

$p'(t_1, \ldots, t_n)$ :- $\beta'_{1E_1}, \ldots, \beta'_{mE_m}$,
        store_evid($p(t_1, \ldots, t_n)$, $[(\beta_1, E_1), \ldots, (\beta_m, E_m)]$).

where $E, E_1, \ldots, E_m$ are distinct new variables, and
for each $\beta = q(u_1, \ldots, u_l)$,

$\beta'_E$ = $q'(u_1, \ldots, u_l, E)$                     if $\beta$ is a non-tabled predicate
$\beta'_E$ = ($q'(u_1, \ldots, u_l)$, $E = ref(\beta)$)      if $\beta$ is a tabled predicate

Fig. 2. Transformation Rules for True Literals



Example 3. The program in Example 2 is transformed as follows. 

:- table reach_t/2.
reach_t(A,B) :- arc_t(A,B,E),
                store_evid(reach(A,B), [((arc(A,B),true),E)]).
reach_t(A,B) :- arc_t(A,C,E1), reach_t(C,B),
                store_evid(reach(A,B),
                           [((arc(A,C),true),E1),
                            ((reach(C,B),true),ref(reach(C,B)))]).
arc_t(a,b,[]).  arc_t(b,a,[]).  arc_t(c,a,[]).

The query reach_t(a,a) will succeed with the evidence stored in two records:

reach(a,a) → [((arc(a,b),true),[]), ((reach(b,a),true),ref(reach(b,a)))]
reach(b,a) → [((arc(b,a),true),[])]
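
To see how such segments combine into one evidence graph, the two records can be followed through their ref pointers. The Python sketch below is only an illustration under assumed names (SEGMENTS, assemble); it encodes each record entry as a (literal, truth value, sub-evidence) triple and ignores nested evidence of non-tabled subgoals, which is empty in this example.

```python
# The two records of Example 3, keyed by the tabled literal they explain.
# A sub-evidence is either a list (inline evidence) or ("ref", literal).
SEGMENTS = {
    "reach(a,a)": [
        ("arc(a,b)", True, []),
        ("reach(b,a)", True, ("ref", "reach(b,a)")),
    ],
    "reach(b,a)": [
        ("arc(b,a)", True, []),
    ],
}


def assemble(root, segments):
    """Follow ref pointers from `root` and return the edges of the evidence."""
    edges, pending, done = set(), [root], set()
    while pending:
        goal = pending.pop()
        if goal in done:
            continue  # a segment is expanded once, even if referred to again
        done.add(goal)
        for (literal, _truth, sub) in segments.get(goal, []):
            edges.add((goal, literal))
            if isinstance(sub, tuple) and sub[0] == "ref":
                pending.append(sub[1])
    return edges
```

For reach(a,a) this produces three edges: from reach(a,a) to arc(a,b) and to reach(b,a), and from reach(b,a) to arc(b,a).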

4 Evidence Generation for False Literals 

Evidence generation for false literals is more difficult than that of true literals,
because when the extended predicates fail, they do not return any tracing
information. We solve this problem by justifying the negation of false literals. We
present the solution in two steps. In the first step, for each literal L, we compute
a dual literal $\overline{L}$ which is equivalent to $\neg L$. In the second step, we apply the
transformation rule for the true literals to $\overline{L}$.

Dual Predicates. Let p/n be a predicate, where p is the predicate name and n is
the arity of p. We say that the predicate $\overline{p}/n$ is the dual predicate of p if p and
$\overline{p}$ return complementary answers for the same input, i.e. for any literal instance
$p(t_1, \ldots, t_n)$, where $t_1, \ldots, t_n$ are terms, $\overline{p}(t_1, \ldots, t_n) = \neg p(t_1, \ldots, t_n)$.

Recall from the definition of evidence (Definition 1) that a literal L is false if
for every clause $\alpha$ :- $\beta_1, \ldots, \beta_n$ such that L unifies with $\alpha$, under any
substitution $\theta$, there is some $j \le n$ such that $\beta_j\theta$ is false and for all $1 \le i < j$, $\beta_i\theta$ is true.
This directly leads to the following definition of dual predicates. For the sake of
simplicity, let p be defined using k clauses in the form of $p(\vec{t_i})$ :- $\beta_{i,1}, \beta_{i,2}$. Then,

$\overline{p}(\vec{x})$ :- $\overline{p_1}(\vec{x}), \ldots, \overline{p_k}(\vec{x})$

where $\overline{p_i}$ captures the failure of the i-th clause of p. Now each i-th clause fails if
either the arguments of p do not match with the arguments of $p_i$, or $\beta_{i,1}$ fails,
or for every answer of $\beta_{i,1}$, $\beta_{i,2}$ fails. This is captured by the following rule:

$\overline{p_i}(\vec{x})$ :- $\vec{x} \neq \vec{t_i}$
$\overline{p_i}(\vec{t_i})$ :- $\overline{\beta_{i,1}}$ ; forall($\beta_{i,1}$, $\overline{\beta_{i,2}}$)

where the predicate forall($\beta_1$, $\beta_2$) encodes $\forall\theta : \beta_1\theta \Rightarrow \beta_2\theta$.

Dual Definitions for Tabled Predicates. For a tabled predicate involving recursive
definitions, however, the dual predicates defined in the above scheme are incorrect
in general. We can view the above definition of the dual as being obtained from
the completion of a logic program [11]. It is well known that completion has
unsatisfactory semantics for negation. This impacts our justification algorithm.
Consider the simple propositional program p :- p, where p/0 is tabled. The dual
predicate produced by the above transformation rule is p_f :- p_f. Because
there is a loop in the definition of p_f, if p_f is not tabled, the query does not
terminate; if it is tabled, it will give the answer false. However, since p fails, p_f
should succeed.

To break the loops in dual predicates, we use the explicit negation of the
tabled predicates instead of their duals in the definitions. In XSB, the fact that a tabled
predicate $\alpha$ has no answer, i.e. $\neg\exists\theta : \alpha\theta$, is encoded by sk_not($\alpha$). In other
tabled LP systems such as TABS and B-Prolog, the operator for tabled negation
is the same as for non-tabled cases. Here we use table_not to represent this
negation operator for all tabled systems. For the above example, we replace p_f
in the body of the dual for p by table_not(p). Now the dual predicate becomes
p_f :- table_not(p), so the query p_f correctly succeeds.

The benefit of using table_not for tabled predicate in the dual definitions 
is that there are no recursive calls to the duals of tabled predicates, hence the 
duals need not be tabled. This not only avoids cycle checking but also enables 
us to implement the justifier as a zero-storage producer, as the evidence does 
not consume permanent space when being passed as an argument. 

Generating Evidence for the Dual Predicates. We apply the transformation rule
for true literals presented in Section 3 to the dual predicates, so that for each
predicate $\alpha$, we generate an extended dual predicate $\overline{\alpha}'_E$ that returns the evidence
for $\neg\alpha$ in the variable E.

The two predicates introduced during dualization, forall and table_not,
need special treatment in the transformation.

— We implement a predicate all_evid(($\beta'_{1E_1}$, $E_1$), ($\beta'_{2E_2}$, $E_2$), E) to compute the
evidence of forall($\beta_1$, $\beta_2$):

$\mathcal{E}(\texttt{forall}(\beta_1, \beta_2)) = \bigcup_{\forall\theta : \beta_1\theta} \{(\beta_1\theta, \mathcal{E}(\beta_1\theta)), (\beta_2\theta, \mathcal{E}(\beta_2\theta))\}$

where $E_1$, $E_2$, and E hold the evidences of $\beta_1\theta$, $\beta_2\theta$, and forall($\beta_1$, $\beta_2$)
respectively.

— To extend the predicate table_not($\beta$), we apply the same technique used
for true tabled predicates, i.e. to store only a pointer to the evidence for $\neg\beta$,
denoted by ref($\beta$). Similar to the evidences for true literals, the evidences
for false tabled literals are stored in segments. The evidence segment for $\neg\beta$
can be generated by calling $\overline{\beta}'_E$, which can be done at any time, giving us an
algorithm for generating partial evidence on demand. The evidence is fully
generated when the evidence segments of all referred literals are computed.

The complete transformation rule for false literals is in Figure 3. A special
case of this transformation is that when a predicate p/n has no definition (thus p
trivially fails), a fact $\overline{p}'(X_1, \ldots, X_n, [])$. is generated according to Step 1, meaning
$\overline{p}'$ trivially succeeds.

Based on the definition of dual predicates and Proposition 1, we can establish
the consistency of the transformed predicate. Also by induction on the size of
evidence, we have:

Proposition 2. Let E be a fresh variable. For any literal $L \in P$,

— the query $\overline{L}'_E$ succeeds if and only if L fails

— if the query $\overline{L}'_E$ succeeds then E returns an evidence for L being false.

Example 4. The program in Example 2 is transformed as follows.

reach_f(A,B,E) :- reach_f1(A,B,E1), reach_f2(A,B,E2), concat([E1,E2],E).
reach_f1(A,B,E) :- arc_f(A,B,E).
reach_f2(A,B,E) :- arc_f(A,C,E).
reach_f2(A,B,E) :- all_evid((arc_t(A,C,E1),E1),
                            (reach_f(C,B), E2=ref(reach(C,B))), E).
arc_f(A,B,E) :- arc_f1(A,B,E1), arc_f2(A,B,E2), arc_f3(A,B,E3),
                concat([E1,E2,E3],E).
arc_f1(A,B,[]) :- (A,B) \= (a,b).
arc_f2(A,B,[]) :- (A,B) \= (b,a).
arc_f3(A,B,[]) :- (A,B) \= (c,a).

The query reach_f(a,c,E) will succeed, returning one evidence segment:

reach(a,c) → [((arc(a,c),false),[]), ((arc(a,b),true),[]),
              ((reach(b,c),false),ref(reach(b,c)))]

For each predicate p/n whose definition is composed of k clauses in the form of

$p(t_{i,1}, \ldots, t_{i,n})$ :- $\beta_{i,1}, \ldots, \beta_{i,m}$.

where $1 \le i \le k$, the extended dual predicate $\overline{p}'$ is defined in two parts:

1. The top-level predicate is

$\overline{p}'(X_1, \ldots, X_n, E)$ :- $\overline{p}'_1(X_1, \ldots, X_n, E_1), \ldots, \overline{p}'_k(X_1, \ldots, X_n, E_k)$,
        concat($[E_1, \ldots, E_k]$, E).

where concat($[E_1, \ldots, E_k]$, E) is a predicate that concatenates $E_1, \ldots, E_k$
together to a single list E.

2. For each $1 \le i \le k$, the predicate $\overline{p}'_i/(n+1)$ is defined by two clauses:

$\overline{p}'_i(X_1, \ldots, X_n, [])$ :- not($(X_1, \ldots, X_n) = (t_{i,1}, \ldots, t_{i,n})$).
$\overline{p}'_i(t_{i,1}, \ldots, t_{i,n}, E)$ :- fevid($[\beta_{i,1}, \ldots, \beta_{i,m}]$, E).

where fevid is a macro recursively defined as

fevid([], E) = fail
fevid($[\beta_1 | B]$, E) = $\overline{\beta}'_{1E_1}$ -> $E = [((\beta_1, false), E_1)]$
        ; all_evid(($\beta'_{1E_1}$, $E_1$), (fevid(B, $E_2$), $E_2$), E).

and $E$, $E_i$, $E_1$, and $E_2$ are distinct new variables. For each atom $\beta = q(u_1, \ldots, u_l)$
appearing in the body of a clause, its extended dual expression is defined as

$\overline{\beta}'_E$ = (table_not($q'(u_1, \ldots, u_l)$), $E = ref(\beta)$)      (if $\beta$ is a tabled predicate)
$\overline{\beta}'_E$ = $\overline{q}'(u_1, \ldots, u_l, E)$                          (otherwise)

Fig. 3. Transformation Rules for False Literals



To produce the evidence segment for reach(b,c), we call reach_f(b,c,E), which
returns

reach(b,c) → [((arc(b,a),true),[]), ((reach(a,c),false),ref(reach(a,c)))]

Now since all referred literals have been visited, the evidence is fully generated.
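
The on-demand generation of segments for false literals can be mimicked as follows. This Python sketch is specific to the reachability program of Example 2 and assumes the queried literal really is false; ARCS, false_segment and explore are names invented for the illustration, and the placeholder used when a node has no outgoing arc is likewise an assumption rather than the paper's notation.

```python
ARCS = {("a", "b"), ("b", "a"), ("c", "a")}  # arc facts of Example 2


def false_segment(x, y):
    """Evidence segment for reach(x,y), assumed false.

    Returns a list of (literal, truth value, sub-evidence) triples.
    """
    seg = [(("arc", x, y), False, [])]  # first clause fails: no arc(x,y)
    successors = sorted(w for (u, w) in ARCS if u == x)
    if not successors:
        # Hypothetical placeholder: no arc leaves x, so clause 2 fails at once.
        seg.append((("arc", x, "_"), False, []))
    for w in successors:  # second clause fails for every arc(x,w)
        seg.append((("arc", x, w), True, []))
        seg.append((("reach", w, y), False, ("ref", ("reach", w, y))))
    return seg


def explore(x, y):
    """Generate segments on demand until every referred literal is covered."""
    segments, pending = {}, [("reach", x, y)]
    while pending:
        goal = pending.pop()
        if goal in segments:
            continue  # already generated; this is what closes the loop
        segments[goal] = false_segment(goal[1], goal[2])
        pending.extend(sub[1] for (_lit, _truth, sub) in segments[goal]
                       if isinstance(sub, tuple))
    return segments
```

Calling explore("a", "c") generates exactly two segments, one for reach(a,c) and one for reach(b,c). The second refers back to the first, so the exploration stops with a loop between two false literals.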

5 Practical Aspects 

In Sections 3 and 4, we described general transformation rules for pure logic 
programs. In practice, however, most programs have non-logical constructs (such
as assert, retract) and meta-operators (such as var, nonvar). To make online 
justification practical, below we describe transformation techniques necessary to 
handle such programs. In addition, we show how online justification can be used 
as a flexible evidence explorer. 

Meta-operators. Since we transform predicate names into different names using
$p'$ and $\overline{p}'$, a literal call(P) in the original program has to be transformed so
that it calls the appropriate transformed literal during execution. If P is a
non-variable, then it is transformed using the general transformation rules; otherwise,
it is transformed into $call'(P)$ or $\overline{call}'(P)$, depending on whether call is being
transformed as a true literal or a false literal, respectively. $call'(P)$ and $\overline{call}'(P)$ are
part of the run-time support for the justifier, which transforms P during execution
and calls the resulting predicate.

The XMC model checker uses the predicate forall(Vars, Antecedent, Consequent),
which is defined as

forall(_Variables, Antecedent, Consequent) :-
    findall(Consequent, Antecedent, AllConsequents),
    all_true(AllConsequents).

all_true([]).
all_true([C|Cs]) :- call(C), all_true(Cs).

If forall is transformed using the general transformation rules, then
all_true generates a quadratic amount of evidence. The transformed predicates
for forall should first collect evidence for antecedent from findall, evidence 
for consequents from all_true and then assemble both to generate the evidence 
for forall. There is no general technique to handle this behavior; hence we 
implemented these transformed predicates by hand. 



Controlling Evidence. As can be seen in Section 6, the transformed program 
has very little time overhead, but in some cases, the space overheads may be 
high; e.g., justification of non-tabled predicates such as append takes an exponential
amount of space to store the evidence due to recursive definitions. To avoid
generating evidence for such predicates, we provide a mechanism to specify the 
predicates the user intends to justify and transform only those predicate defini- 
tions. Limiting the amount of evidence not only results in reducing the overheads, 
but also helps the user explore the evidence easily. 



Constructs with Side-effects. Pure logic predicates have no side-effects, hence 
can be executed many times, without changing the semantics of the program. 
However, assert and retract cannot be executed more than once, as they 
change the semantics of the program. For true literals, the evidence is generated 
during the evaluation of the program, which can be retrieved without executing 
the literal any more. The dual definitions for the literals, on the other hand, 
should not call assert and retract, as the original program wouldn’t have 
executed them. To avoid changing the database during execution of false literals, 
the predicates with assert/retract can be transformed in such a way that 
they first make a copy of the current database into another database, during the 
execution change only the copy and at the end delete the copied database. Thus, 
executing the dual definition doesn’t change the database. 

Flexible Evidence Generation and Exploration. The segments of evidence
generated during query evaluation in online justification can then be used to generate
the full evidence graph. So online justification can be used as a tool to debug 
programs. However, it has a few fundamental differences with the traditional 
trace-based debuggers. Justification gives a declarative view of the logic pro- 
gram, displaying sufficient and necessary information to establish the truth or 
falsity of a query, whereas debuggers provide only the procedural view of the 
execution. Online justification gives flexibility in both generating and exploring 
the evidence generated during evaluation by allowing the user to explore the 
evidence as and when necessary, skipping over uninteresting portions and even 
revisiting the skipped portions later without restarting the debugging process. 

6 Experimental Results 

One of the primary goals of our work is to implement a practical tool for justifi- 
cation of tabled logic programs. To measure the overheads of time and space on 
such programs, we have used the justifier to automatically transform the XMC 
model checker [15] to construct the evidence for model checking. The entire
implementation of the model checker consists of about 150 lines of XSB, most of
which are definitions for the non-tabled predicate models(S,F), which checks if
state S models the μ-calculus formula F, and one definition for a tabled
predicate rec_models(S,F), which checks if state S models the definition of formula F. A
fragment of this model checker is given below:

:- table rec_models/2.

rec_models(State_s, X) :- fDef(X, Y), models(State_s, Y).

models(_State_s, tt).
models(State_s, fAnd(X_1, X_2)) :-
    models(State_s, X_1), models(State_s, X_2).
models(State_s, fOr(X_1, X_2)) :-
    models(State_s, X_1) ; models(State_s, X_2).
models(State_s, fDiam(Act_a, X)) :-
    transition(State_s, Act_a, State_t), models(State_t, X).

Since models(S,F) is a non-tabled predicate, the justifier transforms it into
two predicates: models_t(S,F,E), corresponding to the true-literal justification,
and models_f(S,F,E), corresponding to the false-literal justification, where E
is the evidence. The rec_models predicate is transformed so that the evidence of the
definition and the evidence from models/2 are stored in the evidence database.
The transformed model checker is then used on some examples from the XMC test
suite [6] with various system sizes: Iprotocol, Leader election, Java meta-lock
and Sieve. The system sizes that we have tried are very large (requiring up to
1GB of system memory). Here we report the time and space performance of this
model checker along with the original XMC model checker.

All the tests were performed on an Intel Xeon 1.7GHz machine with 2GB RAM
running RedHat Linux 7.2 and XSB version 2.5 (optimal mode, slg-wam with 
local scheduling) with the garbage collector turned off. 

Figure 4(a) compares the query evaluation time of the original XMC (without
justification) with the transformed XMC (with justification). In our experiments,
the transformed program took at most 50% (and on average 23%) longer
than the original program. Note that all the graphs are drawn in
logarithmic scale, with the performance of XMC without justification on the X-axis and the
performance of the transformed XMC with justification on the Y-axis.

Fig. 4. Time and Space Performance of XMC with and without Justifier



Figure 4(b) shows the space overhead due to evidence generation. In
our experiments, the transformed program has a maximum overhead of 11
times when evidence is generated for every state along every path in the
model (due to forall) in the case of the Leader election protocol, and an
average overhead of about 3.5 times. The comprehensive details about
time and space performance of the online justifier tool can be found at
http://www.lmc.cs.sunysb.edu/~lmc/justifier.

7 Conclusions 

In this paper, we presented a new justification scheme using program transfor- 
mation. In this scheme, a logic program is automatically translated such that 
the translated program builds evidence during query evaluation. The evidence 
so generated can be presented later to the user using an interactive interface. 
The extra overhead due to the evidence generation is so little that the tool we 
implemented using this scheme has been used in practice to generate evidence 
for model checking practical systems. We plan to extend our scheme to handle 
logic programs with side effects, and to integrate our implementation with the 
Evidence Explorer [5], so that the user can easily navigate the evidence. 

Abstract. Although negation is an active area of research in Logic
Programming, sound and complete implementations are still absent from actual Prolog
systems. One of the most promising techniques in the literature is Intensional
Negation (IN), which follows a transformational approach: for each predicate p
in a program its negative counterpart intneg(p) is generated. However,
implementations of IN have not been included in Prolog environments due, in part, to
the lack of details and explicit techniques, such as the treatment of universally
quantified goals. In this paper, we describe a variant of IN, which we have called
Constructive Intensional Negation (CIN). Unlike earlier proposals, CIN does not
resort to a special resolution strategy when dealing with universally quantified 
formulae, which has been instrumental in having an effective implementation. 
Among the contributions of this work we can mention a full implementation be- 
ing tested for its integration in the Ciao Prolog system and some formal results 
with their associated proofs. 

Keywords: Negation, Constraint Logic Programming, Program Transformation, 
Logic Programming Implementation, Constructive Negation. 



1 Introduction 

Kowalski and Colmerauer’s decision on the elements of first-order logic supported in 
Logic Programming was influenced by the availability of implementation techniques 
and efficiency considerations. Among those important aspects not included from the 
beginning we can mention evaluable functions, negation and higher order features. All 
of them have revealed themselves important for the expressiveness of Prolog as a pro- 
gramming language, but, while in the case of evaluable functions and higher order fea- 
tures considerable effort has been invested both in the semantic characterization and its 
efficient implementation we cannot say the same of negation. Many research papers do 
propose semantics to understand and incorporate negation into logic programming, but 
only a small subset of these ideas have their corresponding implementation counterpart. 

In fact, systems such as XSB[4], DLV[1] or SMODELS[3] do implement the well- 
founded or the stable model semantics, but represent a departure from standard Prolog. 
On the other hand, the negation features incorporated in actual Prolog compilers are 
rather limited, namely: the (unsound) negation as failure rule, and the sound (but
incomplete) delay technique of the language Gödel [14], or Nu-Prolog [20] (having the
risk of floundering). The constructive negation of ECLiPSe [2], which was announced
in earlier versions, has been removed from recent releases due to implementation errors.

* This research was partly supported by the Spanish MCYT project TIC2000-1632.

The authors have been involved in a project [17, 18, 19] to incorporate negation 
in a real Prolog system (up to now Ciao [8], but the techniques are easily applicable 
to any other system). This allows us to take advantage of state-of-the-art, WAM-based 
Prolog systems, as well as to reuse thousands of Prolog lines of code. The basis of our 
work is to combine and refine existing techniques to make them useful for practical 
application. Furthermore, we try to use the most efficient technique possible in any
particular case [19]. To make this distinction, we need to use some tests to characterize
the situation for efficiency reasons. To avoid the execution of dynamic tests, we suggest
using the results of a global analysis of the source code. For instance, the primary
technique is the built-in negation as failure that can be used if a groundness analysis 
tells that every negative literal will be ground at call time [18]. 

In order to handle non-ground literals, a number of alternatives to the negation-as-failure
rule have been proposed under the generic name of constructive negation: Chan’s
constructive negation [9, 10, 22, 13], intensional negation [5, 6, 7], fail substitutions,
fail answers, negation as instantiation [21], etc. From a theoretical viewpoint Chan’s
approach is enough, but it is quite difficult to implement and expensive in terms of
execution resources. On the other hand, intensional negation (IN) uses a transformational
approach, so most of the work is performed at compile time and then (a variant of) the
standard SLD resolution is used to execute the resulting program, so a significant gain in
efficiency is expected. There are, however, some problems when transforming certain
classes of programs.

In this paper we focus on the study of an efficient implementation of IN. As formulated
in the original papers, it is difficult to derive an efficient transformation. On the one
hand, the universally quantified goals that are generated are difficult to manage. On the
other hand, the operational behavior of the original program is modified: infinitely many
answers are computed instead of compact results, unless disequality constraints are built
into the Prolog system. This and other implementation details have been missing so far.

The following paragraphs describe existing work on intensional negation. 



1.1 Intensional Negation 

In Intensional Negation [5, 6] a program transformation technique is used to add new
predicates to the program in order to express the negative information. Informally, the
complements of the head terms of the positive clauses are computed and they are used later
as the heads of the negated predicate. We will denote the negated predicate of p as
intneg_p. The following example is taken from [5]:

    even(0) ←                    intneg_even(s(0)) ←
    even(s(s(X))) ← even(X)      intneg_even(s(s(X))) ← intneg_even(X)

The predicate intneg_even succeeds when even fails and vice versa.

Our transformation approach basically follows the ideas of [6], but differs on some
significant points. For the detailed description of the transformation, Barbuti and his
co-authors apply the transformation to a restricted class of programs. Then they show
that any program can be translated into a program of that class. Despite the elegance
and simplicity of the presentation, this description is not very helpful for practical
implementation.

There are two problems with this technique. The first one is that in the presence
of logical variables on the right-hand side (rhs) of a clause, the new program needs
to handle some kind of universal quantification construct. The second point affects the
outcomes of the program: while the new program is semantically equivalent to the
completed program, the operational behavior can differ. In the presence of logical
variables, the new predicate can generate all the possible values one by one, even when a
more general answer can be given in an expanded language (with disequality constraints).
The predicate p/2 defined by the single clause p(X, X) is negated by:

    intneg(p)(X, Y) ← intneg(eq)(X, Y)
    intneg(eq)(0, s(Y))
    intneg(eq)(s(X), 0)
    intneg(eq)(s(X), s(Y)) ← intneg(eq)(X, Y)

if the program only works with natural numbers constructed with 0 and succ/1 (written s
above). The query intneg(p)(X, Y) will generate infinitely many answers, instead of the
more general constraint X ≠ Y. An answer like X ≠ Y can only be replaced by an infinite
number of equalities.
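
The enumeration behaviour can be made concrete with a small sketch. It is written in Python rather than Prolog, and the generator name intneg_eq and the use of "_" for an unbound variable are illustrative assumptions; it only mimics the order in which the three clauses for intneg(eq) above would produce answer patterns.

```python
from itertools import islice

def intneg_eq():
    """Answer patterns of intneg(eq)(X, Y), clause by clause; "_" is an unbound variable."""
    yield ("0", "s(_)")            # clause: intneg(eq)(0, s(Y))
    yield ("s(_)", "0")            # clause: intneg(eq)(s(X), 0)
    for x, y in intneg_eq():       # recursive clause: wrap both arguments in s(...)
        yield (f"s({x})", f"s({y})")

# The stream never ends, so only a finite prefix is printed.
for x, y in islice(intneg_eq(), 6):
    print(f"X = {x}, Y = {y}")
```

No finite prefix of this stream covers every pair of distinct naturals, which is why a single constrained answer X ≠ Y is preferable.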



1.2 Compilative Constructive Negation 

To cope with the second problem it is possible to use the formulas in the program
completion explicitly as rewrite rules, together with a suitable constraint extraction
procedure. This forces us to interpret the transformed program within a Constraint LP
framework, working with equality and disequality constraints over the Herbrand domain.
Compilative Constructive Negation [7] follows the ideas of constructive negation but
introduces constraints for the left-hand side of the predicate (and, therefore, for its
negation). The program transformation is described in the following definition.

Definition 1 (Negation Predicate). Given a predicate p defined by m clauses:

$$C_1 : p(\bar{t}_1) \leftarrow A_1, B_1 \quad \ldots \quad C_m : p(\bar{t}_m) \leftarrow A_m, B_m$$

where each $B_j = G'_{j,1}, \ldots, G'_{j,r_j}$ is a collection of literals and
$A_j = G_{j,1}, \ldots, G_{j,k_j}$ a collection of literals with no free variables
(i.e. $Vars(A_j) \subseteq Vars(\bar{t}_j)$), its Negated Predicate is defined as

$$\neg p(\bar{X}) \leftrightarrow (\forall \bar{Y}_1.\, \neg c_1 \lor \neg A_1 \lor \neg B_1) \land \cdots \land (\forall \bar{Y}_m.\, \neg c_m \lor \neg A_m \lor \neg B_m)$$

where $\bar{Y}_j = \mathit{free\_var}(C_j)$ and $c_j$ is the constraint $\bar{X} = \bar{t}_j$.

We can rename the conjunction as $\neg p(\bar{X}) \leftrightarrow F'_1 \land \cdots \land F'_m$. As for each j we have

$$F'_j = \forall \bar{Y}_j.(\neg c_j \lor \neg A_j \lor \neg B_j) = \forall \bar{Y}_j.(\neg c_j) \lor \forall \bar{Y}_j.(\neg c_j \lor \neg B_j) \lor \neg A_j$$
$$= \forall \bar{Y}_j.(\neg c_j) \lor \forall \bar{Y}_j.(\neg c_j \lor \neg G'_{j,1} \lor \cdots \lor \neg G'_{j,r_j}) \lor \neg G_{j,1} \lor \cdots \lor \neg G_{j,k_j}$$

then we compute the disjunctive normal form $(F_1 \lor \cdots \lor F_n)$ equivalent to the formula
$(F'_1 \land \cdots \land F'_m)$ and replace in each $F_i$ the occurrences of $\neg q$ by compneg_q, obtaining $F^*_i$.
The new program CompNeg(P) contains, for each p, the clauses:

    compneg_p($\bar{X}$) ← $F^*_1$   ...   compneg_p($\bar{X}$) ← $F^*_n$

Universally quantified subgoals resulting from the negation of clauses with free
variables are evaluated by means of a new operational semantics called $SLD^{\forall}$-resolution.
The authors prove the correctness and completeness of the new semantics w.r.t. the
3-valued program interpretation, which in our context means that the result is equivalent
to the Intensional Negation concept of definition 2. However, the introduction of a
new operational mechanism contradicts the compilative nature of the method, makes
an efficient implementation difficult and defeats our goal of combining negated
goals with other Prolog modules already developed and tested. Furthermore, the
compilation scheme is very naive: more than one clause can be created for the same goal, a
lot of useless universally quantified goals are generated and the programs contain a lot
of trivial constraints.

The rest of the paper is organized as follows. Section 2 introduces the basic syntactic
and semantic concepts needed to understand our method. Section 3 formally presents
the transformation algorithm of the Constructive Intensional Negation technique. In
section 4 we discuss our definition of universal quantification, and we provide
implementation details in section 5. Finally, we conclude and discuss some future work
(section 6). Proofs of relevant theorems and lemmata can be found in the appendix¹.



2 Preliminaries 

In this section the syntax of (constraint) logic programs and the intended notion of
correctness are introduced. Programs will be constructed from a signature
$\Sigma = \langle FS_\Sigma, PS_\Sigma \rangle$ of function and predicate symbols. Provided a denumerable set of
variables $V$, the set $Term(FS_\Sigma, V)$ of terms is constructed in the usual way.

A constraint is a first-order formula whose atoms are taken from $Term(FS_\Sigma, V)$ and
where the only predicate symbol is the binary equality operator =/2. A formula
$\neg(t_1 = t_2)$ will be abbreviated $t_1 \neq t_2$. The constants t and f will denote the neutral
elements of conjunction and disjunction, respectively. A tuple $(x_1, \ldots, x_n)$ will be
abbreviated by $\bar{x}$.
The concatenation of tuples $\bar{x}$ and $\bar{y}$ is denoted $\bar{x} \cdot \bar{y}$. The existential closure of
a first-order formula $\varphi$ is denoted $\exists \varphi$.

A (constrained) Horn clause is a formula
$h(\bar{x}) \leftarrow b_1(\bar{y} \cdot \bar{z}), \ldots, b_n(\bar{y} \cdot \bar{z}) \,\Box\, c(\bar{x} \cdot \bar{y})$
where $\bar{x}$, $\bar{y}$ and $\bar{z}$ are tuples from disjoint sets of variables². The variables in $\bar{z}$ are
called the free variables in the clause. The symbols "," and "$\Box$" act here as aliases of
logical conjunction. The atom to the left of the symbol "$\leftarrow$" is called the head or left
hand side (lhs) of the clause. Generalized Horn clauses of the form
$h(\bar{x}) \leftarrow B(\bar{y} \cdot \bar{z}) \,\Box\, c(\bar{x} \cdot \bar{y})$, where the body B can have arbitrary disjunctions,
denoted by ";", and conjunctions of atoms, will be allowed as they can be easily translated
into “traditional” ones.

¹ Available at ftp://lml.ls.fi.upm.es/pub/users/susana/papers/intneg_appendix.ps. Extended
examples are also included.

² The notation $p(\bar{x})$ expresses that $Vars(p(\bar{x})) \subseteq \bar{x}$, not that it is identical to $p(x_1, \ldots, x_n)$.

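A Prolog program (in $\Sigma$) is a set of clauses indexed by $p \in PS_\Sigma$:

$$p(\bar{x}) \leftarrow B_1(\bar{y}_1 \cdot \bar{z}_1) \,\Box\, c_1(\bar{x} \cdot \bar{y}_1) \quad \ldots \quad p(\bar{x}) \leftarrow B_m(\bar{y}_m \cdot \bar{z}_m) \,\Box\, c_m(\bar{x} \cdot \bar{y}_m)$$

The set of defining clauses for predicate symbol p in program P is denoted $def_P(p)$.
Without loss of generality we have assumed that the left hand sides in $def_P(p)$ are
syntactically identical. Actual Prolog programs, however, will be written using a more
traditional syntax.

Assuming the normal form, let
$def_P(p) = \{\, p(\bar{x}) \leftarrow B_i(\bar{y}_i \cdot \bar{z}_i) \,\Box\, c_i(\bar{x} \cdot \bar{y}_i) \mid i \in 1 \ldots m \,\}$.
The completed definition of p, $cdef_P(p)$, is defined as the formula

$$\forall \bar{x}.\, \Big[\, p(\bar{x}) \iff \bigvee_{i=1}^{m} \exists \bar{y}_i.\, c_i(\bar{x} \cdot \bar{y}_i) \land \exists \bar{z}_i.\, B_i(\bar{y}_i \cdot \bar{z}_i) \,\Big]$$
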
Clark’s completion of the program is the conjunction of the completed definitions
for all the predicate symbols in the program along with the formulas that establish the 
standard interpretation for the equality symbol, the so called Clark’s Equality Theory or 
CET. The completion of program P will be denoted Comp(P). Throughout the paper, 
the standard meaning of logic programs will be given by the three-valued interpretation 
of their completion - i.e. its minimum 3-valued model, as defined in [15]. These 3 
values will be denoted t (or success), f (or fail) and u (or unknown). 

Example 1. Let us start with a sample program (over the natural numbers domain) to
compute whether a number is even. It can be expressed as a set of normalized Horn
clauses in the following way:

    even(X) ← □ X=0.
    even(X) ← even(N) □ X=s(s(N)).

Its completion is $CET \land \forall x.\, [\, even(x) \iff x = 0 \lor \exists n.\, x = s(s(n)) \land even(n) \,]$.

We are now able to specify IN formally. Given a signature $\Sigma = \langle FS_\Sigma, PS_\Sigma \rangle$, let
$PS'_\Sigma \supset PS_\Sigma$ be such that for every $p \in PS_\Sigma$ there exists a symbol
$neg(p) \in PS'_\Sigma \setminus PS_\Sigma$. Let $\Sigma' = \langle FS_\Sigma, PS'_\Sigma \rangle$.

Definition 2 (Intensional Negation of a Program). Given a program $P_\Sigma$, its
intensional negation is a program $P'_{\Sigma'}$ such that for every p in $PS_\Sigma$ the following holds:

$$\forall \bar{x}.\, [\, Comp(P) \models_3 p(\bar{x}) \iff Comp(P') \models_3 \neg(neg(p)(\bar{x})) \,]$$



3 Our Transformation

Our transformation differs from the above methods [5, 6, 7], as we have tried to keep
the transformed program as close (syntactically) as possible to the original one (i.e.
minimize the number of extra clauses, trivial constraints, universally quantified goals,
etc.) in order to optimize its behaviour. In this sense, the translation scheme is not trivial
and some simplifications can be obtained by taking into account some properties that
are often satisfied by the source program. For instance, the clauses for a given predicate
are usually mutually exclusive. The following definition formalizes this idea:

Definition 3. A pair of constraints $c_1$ and $c_2$ is said to be incompatible iff their
conjunction $c_1 \land c_2$ is unsatisfiable. A set of constraints $\{c_i\}$ is exhaustive iff
$\bigvee_i c_i = \mathbf{t}$. A predicate definition
$def_P(p) = \{\, p(\bar{x}) \leftarrow B_i(\bar{y}_i \cdot \bar{z}_i) \,\Box\, c_i(\bar{x} \cdot \bar{y}_i) \mid i \in 1 \ldots m \,\}$
is nonoverlapping iff for all $i, j \in 1 \ldots m$ with $i \neq j$ the constraints
$\exists \bar{y}_i.\, c_i(\bar{x} \cdot \bar{y}_i)$ and $\exists \bar{y}_j.\, c_j(\bar{x} \cdot \bar{y}_j)$ are incompatible, and $def_P(p)$ is
exhaustive if the set of constraints $\{\exists \bar{y}_i.\, c_i(\bar{x} \cdot \bar{y}_i)\}$ is exhaustive.

In the following, the symbols "$\triangleright$" and "$\|$" will be used as syntactic sugar for "$\land$"
and "$\lor$" respectively, when writing completed definitions from nonoverlapping and
exhaustive sets of clauses, to highlight their behaviour as a case-like construct. A
nonoverlapping set of clauses can be made into a nonoverlapping and exhaustive set by
adding an extra “default” case:

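Lemma 1. Let p be such that its definition
$def_P(p) = \{\, p(\bar{x}) \leftarrow B_i(\bar{y}_i \cdot \bar{z}_i) \,\Box\, c_i(\bar{x} \cdot \bar{y}_i) \mid i \in 1 \ldots m \,\}$
is nonoverlapping. Then its completed definition is logically equivalent to

$$
\begin{array}{rl}
cdef_P(p) \equiv \forall \bar{x}.\, [\, p(\bar{x}) \iff & \exists \bar{y}_1.\, c_1(\bar{x} \cdot \bar{y}_1) \triangleright \exists \bar{z}_1.\, B_1(\bar{y}_1 \cdot \bar{z}_1) \;\| \\
& \qquad \vdots \\
& \exists \bar{y}_m.\, c_m(\bar{x} \cdot \bar{y}_m) \triangleright \exists \bar{z}_m.\, B_m(\bar{y}_m \cdot \bar{z}_m) \;\| \\
& \bigwedge_{i=1}^{m} \neg \exists \bar{y}_i.\, c_i(\bar{x} \cdot \bar{y}_i) \triangleright \mathbf{f} \,]
\end{array}
$$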

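The interesting fact about this kind of definition is captured by the following
lemma:

Lemma 2. Given $\{C_1, \ldots, C_n\}$ a set of exhaustive and nonoverlapping Herbrand
constraints, $\{B_1, \ldots, B_n\}$ a set of first-order formulas with no occurrences of $\bar{x}$, and $\bar{y}_i$ a
set of variables included in the free variables of $C_i$, $i \in [1..n]$, the following holds:

$$\forall \bar{x}\, \big( \neg[\exists \bar{y}_1 (C_1 \triangleright B_1) \,\|\, \cdots \,\|\, \exists \bar{y}_n (C_n \triangleright B_n)] \iff \exists \bar{y}_1 (C_1 \triangleright \neg B_1) \,\|\, \cdots \,\|\, \exists \bar{y}_n (C_n \triangleright \neg B_n) \big)$$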
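
The role of the two hypotheses of Lemma 2 can be checked on a propositional, two-valued abstraction. The Python sketch below is an illustration under strong simplifying assumptions (no quantifiers, no third truth value, guards and bodies as plain booleans); the function names are hypothetical.

```python
from itertools import product

def lemma2_holds(n):
    """Check not(OR_i (C_i and B_i)) == OR_i (C_i and not B_i) when exactly one guard is true."""
    for active in range(n):                        # exhaustive and nonoverlapping guards
        guards = [i == active for i in range(n)]
        for bodies in product([False, True], repeat=n):
            lhs = not any(c and b for c, b in zip(guards, bodies))
            rhs = any(c and not b for c, b in zip(guards, bodies))
            if lhs != rhs:
                return False
    return True

def without_exhaustiveness():
    """If no guard holds, the two sides of the equivalence differ."""
    guards, bodies = [False, False], [True, True]
    lhs = not any(c and b for c, b in zip(guards, bodies))
    rhs = any(c and not b for c, b in zip(guards, bodies))
    return lhs, rhs

print(lemma2_holds(3))
print(without_exhaustiveness())
```

The second function shows why the extra “default” case of Lemma 1 is needed before negation can be pushed inside the case construct.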

This means that the overall structure of the formula is preserved after negation, 
which seems rather promising for our purposes. 

The idea of the transformation to be defined below is to obtain a program whose 
completion corresponds to the negation of the original program, in particular to a rep- 
resentation of the completion where negative literals have been eliminated. We call this 
transformation Negate. 

Definition 4 (Constructive Intensional Negation of a predicate). Let p be a predicate
of a program P and let $cdef_P(p) \equiv \forall \bar{x}.\, [p(\bar{x}) \iff D]$ be its completed definition.
The constructive intensional negation of p, Negate(p), is the formula obtained after the
successive application of the following transformation steps:

1. For every completed definition of a predicate p in $PS_\Sigma$, $\forall \bar{x}.\, [p(\bar{x}) \iff D]$, add the
completed definition of its negated predicate, $\forall \bar{x}.\, [neg\_p(\bar{x}) \iff \neg D]$.

2. Move negations to the right of "$\triangleright$" using lemma 2.

3. If negation is applied to an existential quantification, replace $\neg \exists \bar{z}.\, C$ by $\forall \bar{z}.\, \neg C$.

4. Replace $\neg C$ by its negated normal form³ $NNF(\neg C)$.

5. Replace $\neg \mathbf{t}$ by $\mathbf{f}$, $\neg \mathbf{f}$ by $\mathbf{t}$ and, for every predicate symbol p, $\neg p(\bar{t})$ by $neg\_p(\bar{t})$.



³ The negated normal form of C is obtained by successive application of the De Morgan laws
until negations only affect atomic subformulas of C.

Definition 5 (Constructive Intensional Negation of a program). Let P be a program.
The constructive intensional negation of P is the transformation of P into another
program consisting of the constructive intensional negations of all predicates of P.

Lemma 3. Let the formula Negate(P) be the constructive intensional negation of a
program P. Then, for every predicate symbol $p \in PS_\Sigma$ of P,

$$Negate(P) \models_3 (\forall \bar{x}.\, neg\_p(\bar{x}) \iff \neg p(\bar{x})).$$

Definition 6. The syntactic transformation negate_rhs is defined as follows:

    negate_rhs(P ; Q) = negate_rhs(P) , negate_rhs(Q)
    negate_rhs(P , Q) = negate_rhs(P) ; negate_rhs(Q)
    negate_rhs(t) = f
    negate_rhs(f) = t
    negate_rhs(p($\bar{t}$)) = neg_p($\bar{t}$)
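
As an illustration, the five rewriting rules can be transcribed over a small goal representation. This is a Python sketch, not the authors' implementation: the tuple encoding of goals ("and", "or", "atom") is an assumption made only for this example.

```python
def negate_rhs(goal):
    """Negate a clause body.

    Goals are "t", "f", ("atom", name, args), ("and", g1, g2) or ("or", g1, g2).
    """
    if goal == "t":
        return "f"
    if goal == "f":
        return "t"
    tag = goal[0]
    if tag == "or":      # negate_rhs(P ; Q) = negate_rhs(P) , negate_rhs(Q)
        return ("and", negate_rhs(goal[1]), negate_rhs(goal[2]))
    if tag == "and":     # negate_rhs(P , Q) = negate_rhs(P) ; negate_rhs(Q)
        return ("or", negate_rhs(goal[1]), negate_rhs(goal[2]))
    _, name, args = goal  # atom p(t) becomes neg_p(t)
    return ("atom", "neg_" + name, args)

body = ("or",
        ("atom", "parent", ("X", "Y")),
        ("and", ("atom", "parent", ("Z", "Y")), ("atom", "ancestor", ("X", "Z"))))
print(negate_rhs(body))
```

The transformation is purely syntactic: it does not look inside argument terms and it does not introduce quantifiers, which are handled separately.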

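Definition 7 (Constructive Intensional Negation). For every predicate p in the
original program, assuming
$def_P(p) = \{\, p(\bar{x}) \leftarrow B_i(\bar{y}_i \cdot \bar{z}_i) \,\Box\, c_i(\bar{x} \cdot \bar{y}_i) \mid i \in 1 \ldots m \,\}$
a nonoverlapping and exhaustive definition, the following clauses will be added to the
negated program:

- If the set of constraints $\{\exists \bar{y}_i.\, c_i(\bar{x} \cdot \bar{y}_i)\}$ is not exhaustive, a clause

$$neg\_p(\bar{x}) \leftarrow \,\Box\, \bigwedge_{i=1}^{m} \neg \exists \bar{y}_i.\, c_i(\bar{x} \cdot \bar{y}_i)$$

- If $\bar{z}_j$ is empty, the clause $neg\_p(\bar{x}) \leftarrow negate\_rhs(B_j(\bar{y}_j)) \,\Box\, c_j(\bar{x} \cdot \bar{y}_j)$

- If $\bar{z}_j$ is not empty, the clauses

$$neg\_p(\bar{x}) \leftarrow forall(\bar{z}_j,\, p\_j(\bar{y}_j \cdot \bar{z}_j)) \,\Box\, c_j(\bar{x} \cdot \bar{y}_j)$$
$$p\_j(\bar{y}_j \cdot \bar{z}_j) \leftarrow negate\_rhs(B_j(\bar{y}_j \cdot \bar{z}_j))$$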

We can see that negating a clause with free variables introduces “universally
quantified” goals by means of a new predicate forall/2, which is discussed in section 4,
where soundness and completeness results are given. Implementation issues are discussed in
section 5. In the absence of free variables the transformation is trivially correct, as the
completion of the negated predicate corresponds to the negation-free normal form
described above.

Theorem 1. The result of applying the Constructive Intensional Negation transforma- 
tion to a (nonoverlapping) program P is an intensional negation (as defined in def. 2). 

3.1 Overlapping Definitions 

When the definition of some predicate has overlapping clauses, the simplicity of the 
above transformation is lost. Rather than defining a different scheme for overlapping 
rules, we will give a translation of general sets of clauses (that can be applied to any 
predicate definition) into nonoverlapping ones: 
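Lemma 4. Let p be such that
$def_P(p) = \{\, p(\bar{x}) \leftarrow B_i(\bar{y}_i \cdot \bar{z}_i) \,\Box\, c_i(\bar{x} \cdot \bar{y}_i) \mid i \in 1 \ldots m \,\}$
and there exist $j, k \in 1 \ldots m$ such that $\exists \bar{y}_j.\, c_j(\bar{x} \cdot \bar{y}_j)$ and $\exists \bar{y}_k.\, c_k(\bar{x} \cdot \bar{y}_k)$ are
compatible. Then the j-th and k-th clauses can be replaced by the clause

$$p(\bar{x}) \leftarrow B_j(\bar{y}_j \cdot \bar{z}_j) \,;\, B_k(\bar{y}_k \cdot \bar{z}_k) \,\Box\, c_j(\bar{x} \cdot \bar{y}_j) \land c_k(\bar{x} \cdot \bar{y}_k)$$

and the following additional clauses (if the new constraints are not equivalent to $\mathbf{f}$):

$$p(\bar{x}) \leftarrow B_j(\bar{y}_j \cdot \bar{z}_j) \,\Box\, c_j(\bar{x} \cdot \bar{y}_j) \land \neg \exists \bar{y}_k.\, c_k(\bar{x} \cdot \bar{y}_k)$$
$$p(\bar{x}) \leftarrow B_k(\bar{y}_k \cdot \bar{z}_k) \,\Box\, \neg \exists \bar{y}_j.\, c_j(\bar{x} \cdot \bar{y}_j) \land c_k(\bar{x} \cdot \bar{y}_k)$$

without changing the standard meaning of the program. The process can be repeated if
there are more than two overlapping clauses. It is clear that it is a finite process.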

3.2 Examples 

Let us show some motivating examples: 

Example 2. The predicate lesser/2 can be defined by the following nonoverlapping
set of clauses

    lesser(0, s(Y))
    lesser(s(X), s(Y)) ← lesser(X, Y)

or, normalizing,

    lesser(N,M) ← □ N=0, M=s(Y)
    lesser(N,M) ← lesser(X,Y) □ N=s(X), M=s(Y)

By lemma 1, the completed definition of lesser is

$$
\begin{array}{rl}
cdef_P(lesser) \equiv \forall n, m.\; lesser(n, m) \iff & \exists y.\, n = 0 \land m = s(y) \triangleright \mathbf{t} \;\| \\
& \exists x, y.\, n = s(x) \land m = s(y) \triangleright lesser(x, y) \;\| \\
& m = 0 \triangleright \mathbf{f}
\end{array}
$$

We have assumed that the constraint
$\neg(\exists y.\, n = 0 \land m = s(y)) \land \neg(\exists x, y.\, n = s(x) \land m = s(y))$
has been simplified to m = 0 (if the only symbols in the program are 0 and s/1).

And the generated Prolog program for neg_lesser is:

    neg_lesser(N,M) ← neg_lesser(X,Y) □ N=s(X), M=s(Y)
    neg_lesser(N,M) ← □ M=0
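
For ground arguments, the expected relationship between the two programs can be checked directly. The following Python sketch encodes naturals as non-negative integers (an assumption of this illustration: 0 for 0 and n + 1 for s(n)) and follows the clauses of lesser and neg_lesser one by one.

```python
def lesser(n, m):
    # lesser(0, s(Y)).
    if n == 0 and m >= 1:
        return True
    # lesser(s(X), s(Y)) :- lesser(X, Y).
    if n >= 1 and m >= 1:
        return lesser(n - 1, m - 1)
    return False          # no clause applies

def neg_lesser(n, m):
    # neg_lesser(N, M) :- neg_lesser(X, Y), with N = s(X), M = s(Y).
    if n >= 1 and m >= 1:
        return neg_lesser(n - 1, m - 1)
    # neg_lesser(N, M) with M = 0.
    return m == 0

# On ground naturals, exactly one of the two predicates succeeds.
assert all(lesser(n, m) != neg_lesser(n, m) for n in range(30) for m in range(30))
```

This only exercises ground calls; the interesting non-ground behaviour depends on the constraint handling discussed later in the paper.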

Example 3. The second example, family, is also well-known, and includes free
variables on the rhs of the definition of the predicate ancestor/2:

    parent(john, mary)       ancestor(X, Y) ← parent(X, Y)
    parent(john, peter)      ancestor(X, Y) ← parent(Z, Y) ∧ ancestor(X, Z)
    parent(mary, joe)
    parent(peter, susan)

As ancestor/2 is defined by a pair of overlapping clauses, we get:

    ancestor(N,M) ← parent(X,Y) ; (parent(Z,Y), ancestor(X,Z)) □ N=X, M=Y

The corresponding Prolog program for neg_ancestor is:

    neg_ancestor(N,M) ← neg_parent(X,Y),
                        forall([Z], neg_parent(Z,Y) ; neg_ancestor(X,Z)) □ N=X, M=Y
4 Universal Quantification

The main difference between our method and proposals like [10, 7] lies in a treatment
of universal quantification that works by exploring the Herbrand Universe, making use
of the following:

1. A universal quantification of the goal Q over a variable X succeeds when Q
succeeds without binding (or constraining) X.

2. A universal quantification of Q over X is true if Q is true for all possible values for
the variable X.

This results in a mutually recursive rule that can be expressed formally as follows:

$$\forall X.\, Q(X) \equiv Q(sk) \lor [\, \forall \bar{X}_1.\, Q(c_1(\bar{X}_1)) \land \cdots \land \forall \bar{X}_n.\, Q(c_n(\bar{X}_n)) \,]$$

where $FS_\Sigma = \{c_1 \ldots c_n\}$ is the set of function symbols and sk is a Skolem constant,
that is, $sk \notin FS_\Sigma$. In practice, the algorithm proceeds by trying the Skolem case first
and, if this fails, then it expands the variables in all possible constructors. The
following definitions try to capture some tricky details of the procedure, which are
necessary for ensuring its completeness. For reasons of simplicity, in the following we will
only consider quantifications of the form $\forall x.\, Q$ where Q has neither quantifiers nor free
variables (apart from x), but the results generalize to the case of quantification over
several variables.

Definition 8 (Covering). A covering of $Term(FS_\Sigma, V)$ is any finite sequence of terms
$\langle t_1 \ldots t_n \rangle$ such that:

- For every i, j with $i \neq j$, $t_i$ and $t_j$ do not unify, i.e. there is no ground substitution $\sigma$
with $t_i\sigma = t_j\sigma$.

- For every ground term s of the Herbrand Universe there exists i and a ground
substitution $\sigma$ with $s = t_i\sigma$.

The simplest covering is a variable, $C_1 = \langle X \rangle$. If a program uses only natural
numbers ($FS_\Sigma = \{0, s/1\}$), for example, $C_2 = \langle 0, s(X) \rangle$, $C_3 = \langle 0, s(0), s(s(X)) \rangle$ and
$C_4 = \langle 0, s(0), s(s(0)), s(s(s(X))) \rangle$, etc., are also coverings. This example also suggests how
to generate coverings incrementally. We start from the simplest covering $\langle X \rangle$. From one
covering we generate the next one by choosing one term and one variable of this term.
The term is removed and then we add all the terms obtained by replacing the variable by
all the possible instances of that element.

Definition 9 (Next Covering). Given a covering $C = \langle t_1, \ldots, t_m \rangle$, the next covering of
C over the variable X is defined as
$next(C, X) = \langle t_1, \ldots, t_{j-1}, t_{j+1}, \ldots, t_m, t_{j1}, \ldots, t_{jn} \rangle$,
where $t_j$ is the term of C in which X occurs and each $t_{jk} = t_j\sigma_k$ is obtained from the
symbols in $FS_\Sigma$ by applying, for each $c_k \in FS_\Sigma$, the substitution
$\sigma_k = [X \mapsto c_k(\bar{X}_k)]$, where $\bar{X}_k \cap Vars(t_j) = \emptyset$. We can say that a
covering C is less instantiated than $next(C, X)$ in the sense of [21].

Definition 10 (Sequence of Coverings). $S = \langle C_1, \ldots, C_n \rangle$ is a sequence of coverings
if $C_1 = \langle X \rangle$ for some variable X, and for every $i \in \{1 \ldots n-1\}$, $C_{i+1} = next(C_i, X_i)$,
where $X_i$ is the variable appearing in $C_i$ at the leftmost position.
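
Definitions 9 and 10 can be sketched as follows. This is an illustrative Python rendering, not the authors' Ciao code: the term encoding (variables as strings, compound terms as tuples), the dictionary NATURALS and all function names are assumptions of the example.

```python
from itertools import count

# Assumed signature for illustration: the natural numbers, {0, s/1}.
NATURALS = {"0": 0, "s": 1}

def is_var(term):
    # Variables are plain strings; other terms are tuples (symbol, arg1, ...).
    return isinstance(term, str)

def variables(term):
    """Variables of a term, in left-to-right order."""
    if is_var(term):
        return [term]
    return [v for arg in term[1:] for v in variables(arg)]

def substitute(term, var, value):
    """Replace every occurrence of var in term by value."""
    if is_var(term):
        return value if term == var else term
    return (term[0],) + tuple(substitute(a, var, value) for a in term[1:])

def next_covering(covering, var, fresh, signature=NATURALS):
    """Definition 9: drop the term containing var and append its instances."""
    j = next(i for i, t in enumerate(covering) if var in variables(t))
    instances = []
    for symbol, arity in signature.items():
        args = tuple(f"X{next(fresh)}" for _ in range(arity))   # fresh variables
        instances.append(substitute(covering[j], var, (symbol,) + args))
    return covering[:j] + covering[j + 1:] + instances

def coverings(n, signature=NATURALS):
    """Definition 10: the first n coverings, always expanding the leftmost variable."""
    fresh = count(1)
    current = ["X"]
    sequence = [current]
    for _ in range(n - 1):
        leftmost = next(v for t in current for v in variables(t))
        current = next_covering(current, leftmost, fresh, signature)
        sequence.append(current)
    return sequence

def show(term):
    if is_var(term):
        return term
    if len(term) == 1:
        return term[0]
    return f"{term[0]}({', '.join(show(a) for a in term[1:])})"

for c in coverings(4):
    print(", ".join(show(t) for t in c))
```

For the natural numbers the four printed coverings coincide, up to variable renaming, with $C_1$ to $C_4$ above. Passing a different signature dictionary (for instance a hypothetical `{"nil": 0, "cons": 2}`) produces coverings for other term algebras.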
Lemma 5. Given an infinite sequence of coverings $S = \langle C_1, C_2, \ldots \rangle$ and a ground term
$t \in Term_\Sigma$, there exists a finite $n \in \mathbb{N}$ such that $t \in C_n$.

The proof of this lemma follows directly from the definition of next covering.

To use coverings as successive approximations of a universal quantification, it is
necessary to replace their variables by Skolem constants:

Definition 11 (Evaluation of a Covering). Given a covering C, we define skolem(C)
as the result of replacing every variable by a different Skolem constant. The
evaluation of C for the quantified goal $\forall X.Q$ is the evaluation (meaning) of the three-valued
conjunction of the results of evaluating Q with all the elements in skolem(C). That is

$$eval(C, \forall X.Q) = I(Q[X \mapsto t_1] \land \cdots \land Q[X \mapsto t_m]) \in \{\mathbf{t}, \mathbf{f}, \mathbf{u}\}$$

where $skolem(C) = \langle t_1 \ldots t_m \rangle$.
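
The three-valued conjunction used in Definition 11 can be written down explicitly. The sketch below assumes the usual strong (Kleene) conjunction, which agrees with the case analysis given for eval in this section; the string encoding of the truth values and the name and3 are choices of this illustration.

```python
def and3(values):
    """Three-valued conjunction over "t", "f" and "u"."""
    values = list(values)
    if "f" in values:       # a single false conjunct makes the conjunction false
        return "f"
    if "u" in values:       # otherwise any undefined conjunct makes it undefined
        return "u"
    return "t"              # all conjuncts are true (also for the empty conjunction)

print(and3(["t", "u", "f"]))
print(and3(["t", "u"]))
print(and3(["t", "t"]))
```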

After the evaluation of a covering $C_i$ for a given quantified goal, several situations
may arise:

1. $eval(C_i, \forall X.Q) = \mathbf{t}$. $C_i$ is true for $\forall X.Q$, so we can infer that the universal
quantification of Q succeeds.

2. $eval(C_i, \forall X.Q) = \mathbf{u}$. $C_i$ is undefined for $\forall X.Q$, so we can infer that the universal
quantification of Q loops (it is undefined).

3. $eval(C_i, \forall X.Q) = \mathbf{f}$. According to the three-valued semantics of conjunction this
will happen whenever there exists some $t_j$ in $skolem(C_i)$ such that $Q[X \mapsto t_j] = \mathbf{f}$.
We can consider two subcases:

- There is some $t_j$ in $skolem(C_i)$ which does not contain Skolem constants such
that $Q[X \mapsto t_j] = \mathbf{f}$. $C_i$ is false, so we can infer that the universal quantification of
Q fails.

- Otherwise, every $t_j$ in $skolem(C_i)$ such that $Q[X \mapsto t_j] = \mathbf{f}$ contains
Skolem constants. We cannot infer that the universal quantification of Q fails, and
$eval(C_{i+1}, \forall X.Q)$ must be considered. We will say that $C_i$ is undetermined for $\forall X.Q$,
and the next covering is evaluated: $eval(C_i, \forall X.Q) = eval(C_{i+1}, \forall X.Q)$.



Definition 12 (forall). The evaluation of a universally quantified goal Q over a set of
variables Vars, forall(Vars, Q), is the process of evaluating the first covering $C_1$ for
$\forall Vars.\, Q$, that is:

$$forall(Vars, Q) = eval(C_1, \forall Vars.\, Q)$$
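
The case analysis above can be turned into a small executable sketch. It is written in Python and makes several simplifying assumptions that are not part of the paper: the signature is fixed to the natural numbers, goals are Python functions returning True or False on terms (so a looping goal is not modelled), a Skolem constant is a term that matches no clause head, and the bound max_coverings stands in for non-termination by returning "u".

```python
ZERO = ("0",)
SK = ("sk",)                      # a Skolem constant, not in the signature {0, s/1}

def s(term):
    return ("s", term)

def num(n, base=ZERO):
    """Build s^n(base)."""
    for _ in range(n):
        base = s(base)
    return base

def has_skolem(term):
    return term == SK or any(has_skolem(arg) for arg in term[1:])

def skolem_covering(k):
    """skolem(C_k) for the naturals: 0, s(0), ..., s^(k-2)(0), s^(k-1)(sk)."""
    return [num(i) for i in range(k - 1)] + [num(k - 1, SK)]

def lesser(a, b):
    """lesser/2 of Example 2 on terms; a Skolem constant unifies with no head."""
    if a == ZERO and b[0] == "s":
        return True
    if a[0] == "s" and b[0] == "s":
        return lesser(a[1], b[1])
    return False

def nat(term):
    """nat(0).  nat(s(X)) :- nat(X)."""
    return term == ZERO or (term[0] == "s" and nat(term[1]))

def forall(goal, max_coverings=50):
    for k in range(1, max_coverings + 1):
        failed = [t for t in skolem_covering(k) if not goal(t)]
        if not failed:
            return "t"            # case 1: the covering is true
        if any(not has_skolem(t) for t in failed):
            return "f"            # case 3, first subcase: a ground counterexample
        # case 3, second subcase: undetermined, move on to the next covering
    return "u"                    # bound exhausted (stand-in for looping)

print(forall(lambda x: lesser(ZERO, s(x))))    # every successor is greater than 0
print(forall(lambda x: lesser(x, num(2))))     # fails: s(s(0)) is a counterexample
print(forall(nat, max_coverings=20))           # undetermined at every covering
```

The third call corresponds to a quantification that holds for every ground term but cannot be established by any finite covering, so the sketch can only give up after the bound.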

With this method, we construct a diagonalization to cover the domain of a set of
variables, inspired by Cantor diagonalization. Cantor diagonalization is used
to prove that the cartesian product of denumerable factors ($\mathbb{N}^n$) is denumerable, i.e., it
ensures that all elements are visited in a finite number of steps. So it guarantees that the
elements will be evaluated fairly during the generation of coverings, and an infinite branch
in the evaluation of the covering terms is impossible. It can also be applied in many
decidability and computability results. This and other diagonalization methods are studied
and used in [11].

It is important to point out some characteristics of Kunen’s interpretation that are
the key to obtaining completeness results for our approach.

Something can succeed or fail at infinity ($I_\omega$) in Fitting semantics, but if there is no
finite set in which we can check the failure or success of a goal in Kunen semantics, the
goal is undefined. If we consider the simple predicate nat/1:

    nat(0) ←
    nat(s(X)) ← nat(X)

we find that while $\forall X.\, nat(X)$ succeeds in Fitting semantics (at $I_\omega$), it is undefined in
Kunen semantics, because it cannot be evaluated in a finite number of steps. That is,
there is no $n \in \mathbb{N}$ such that $I_n \models \forall X.\, nat(X)$. This is the Prolog interpretation. The reader
is referred to Kunen’s paper [15] for a wider discussion on the subject.

Similarly, $\forall X.\, nat(X)$ will fail in Kunen semantics iff $\exists X.\, \neg nat(X)$, that is, if there
exists a counterexample that we are able to find in a finite process.

During this process, nontermination may arise either during the evaluation of one
covering or because the growth of the sequence of coverings does not end. The following
lemmata show that this does not affect the completeness of the method:

Lemma 6. Let $S = \langle C_1, C_2, \ldots \rangle$ be a sequence of coverings. $\forall X.Q$ is false iff there
exists a finite $n \in \mathbb{N}$ such that $eval(C_n, \forall X.Q) = \mathbf{f}$.

Lemma 7. Let $S = \langle C_1, C_2, \ldots \rangle$ be a sequence of coverings. If there is any $C_i$ in S such
that $eval(C_i, \forall X.Q) = \mathbf{u}$, then $eval(C_{i+1}, \forall X.Q) = \mathbf{u}$.

Lemma 8. Let $S = \langle C_1, C_2, \ldots \rangle$ be an infinite sequence of coverings such that for every
$i > 0$, $C_i$ is undetermined for $\forall X.Q$. Then it can be proved that, under the three-valued
semantics of the program, the quantification $\forall X.Q$ is undefined.

That is, loops introduced during the computation of coverings are consistent with
the semantics of universal quantification. The case considered in the last lemma
corresponds, essentially, to universal quantifications that hold at infinite iterations of
Fitting’s transformer but not at any finite instance.

Lemma 9. Let $S = \langle C_1, C_2, \ldots \rangle$ be a sequence of coverings. If $\forall X.Q$ succeeds then
there exists a finite $n \in \mathbb{N}$ such that $eval(C_n, \forall X.Q) = \mathbf{t}$.

Theorem 2 (Completeness of forall). When $\forall X.Q$ succeeds/fails in Kunen semantics,
then forall(X, Q) succeeds/fails.

Theorem 3 (Correctness of forall). When forall(X, Q) succeeds/fails in Kunen
semantics, then $\forall X.Q$ succeeds/fails.
5 Implementation Issues

Disequality Constraints An instrumental step in order to manage negation in a more
advanced way is to be able to handle disequalities between terms such as $t_1 \neq t_2$.
Prolog implementations typically include only the built-in predicate \==/2, which can only
work with disequalities if both terms are ground and simply succeeds in the presence of free
variables. A “constructive” behavior must allow the “binding” of a variable with a
disequality. On the other hand, the negation of an equation $X = t(\bar{Y})$ produces the universal
quantification of the free variables in the equation, unless a more external quantification
affects them. The negation of such an equation is $\forall \bar{Y}.\, X \neq t(\bar{Y})$. As we explained in [17],
the inclusion of disequalities and constrained answers has a very low cost as a direct
consequence of [9, 10, 22, 13]. It incorporates negative normal form constraints instead
of bindings and the decomposition step can produce disjunctions. More precisely, the
normal form of constraints is:

$$\underbrace{\bigwedge_i (X_i = t_i)}_{\text{positive information}} \land \underbrace{\Big( \bigwedge_j \forall \bar{Z}_j^1.\, (Y_j^1 \neq s_j^1) \lor \cdots \lor \bigwedge_l \forall \bar{Z}_l^n.\, (Y_l^n \neq s_l^n) \Big)}_{\text{negative information}}$$

where each $X_i$ appears only in $X_i = t_i$, none of the $s_k^r$ is $Y_k^r$ and the universal
quantification could be empty (leaving a simple disequality). Using some normalization rules
we can obtain a normal form formula from any initial formula. It is easy to redefine the
unification algorithm to manage constrained variables. This very compact way to represent a
normal form was first presented in [16] and differs from Chan’s representation where
only disjunctions are used⁴.

We have defined a predicate =/=/2 [17], used to check disequalities, in a similar
way to explicit unification (=/2). Each constraint is a disjunction of conjunctions of
disequalities that are implemented as a list of lists of terms of the form T1/T2 (which
represents the disequality $T_1 \neq T_2$). When a universal quantification is used in a
disequality (e.g., $\forall Y.\, X \neq c(Y)$) the new constructor fA/1 is used (e.g., X/c(fA(Y))).



Implementation of Universal Quantification We implement universal quantification
by means of a predicate forall/2 that is the direct implementation of the predicate
defined in definition 12. The execution of forall([X1, ..., Xn], Q) implies a covering
evaluation $eval(C_1, \forall \bar{X}.\, Q)$.

We have optimized the covering evaluation process to avoid repeating the evaluation
of the same term in successive coverings⁵.

We keep a list with the content of the first covering, update it with the elements of the
second covering, and so on. At each step of the evaluation we check the value of the
quantified goal for all elements of the covering at that step.

However, all ground terms of the covering (i.e., terms without Skolem constants) and
terms with universal variables are only evaluated once. To perform this optimization,
we keep a queue (implemented as a list). Expanded terms, from one covering to the
next one, will be added to the bottom of the queue and only these new terms need to be
evaluated in the next step of the evaluation.

⁴ Chan treats the disjunctions by means of backtracking. The main advantage of our normal
form is that the search space is drastically reduced.

⁵ Covering in this section refers to coverings with Skolem constants.

In the evaluation of a term t for the goal Q:

- If $Q[X \mapsto t] = \mathbf{t}$ then it will succeed in all the following coverings, so it is eliminated
from the queue and we continue evaluating the other terms in the queue.

- If $Q[X \mapsto t] = \mathbf{f}$ and t contains no Skolem constants then $\forall X.Q$ fails and the
evaluation process finishes (a failing term with Skolem constants is instead expanded into
the next covering).

- If $Q[X \mapsto t] = \mathbf{u}$ then $\forall X.Q$ loops.

The key to the completeness of the process lies in the selection of the term of the
queue to be evaluated first.

If we use the left-to-right Prolog selection rule, then the implementation is
incomplete, although more efficient. This happens when we evaluate a term t with
$Q[X \mapsto t] = \mathbf{u}$ and there is another term t' in the queue such that $Q[X \mapsto t'] = \mathbf{f}$. In
these cases, forall([X1, ..., Xn], Q) loops instead of failing.

This is why it is important to use a fair selection rule: it ensures that if the
evaluation of some term fails, the failure will be found before looping in the undefined
evaluation of another term of the queue.

As we have seen in the definition of the quantifier, this implementation is correct,
and if we use a fair selection rule to choose the order in which the terms of the covering
are visited then it is also complete. We can implement this because in Ciao Prolog it is
possible to ensure AND-fairness by goal shuffling [8].

If we implement this in a Prolog compiler using depth-first SLD-resolution, the
selection will make the process incomplete. When we are evaluating a covering
$C = \langle t_1, \ldots, t_m \rangle$ for a goal Q(X) with the depth-first strategy from left to right, the
evaluation of some $Q(t_j)$ may be unknown; if there exists some $k > j$ such that $t_k$ is
ground and $Q(t_k)$ fails, then we would obtain that forall([X], Q(X)) is unknown
when indeed it is false.
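
The difference between the two selection rules can be simulated with a rough sketch. It is written in Python and is not how Ciao implements goal shuffling: goal evaluations are modelled as generators that take one step per `next` call, and the names looping, finishes, fair_conjunction and left_to_right, as well as the step bound max_steps, are assumptions of this illustration.

```python
from collections import deque

def looping():
    """A goal whose evaluation never terminates (value u)."""
    while True:
        yield

def finishes(steps, value):
    """A goal that takes `steps` steps and then answers "t" or "f"."""
    for _ in range(steps):
        yield
    return value

def fair_conjunction(goals, max_steps=10_000):
    """Round-robin evaluation: every pending goal advances one step in turn."""
    queue = deque(goals)
    steps = 0
    while queue:
        if steps == max_steps:
            return "u"              # bound exhausted: stands in for non-termination
        steps += 1
        goal = queue.popleft()
        try:
            next(goal)              # run one step of this goal
        except StopIteration as done:
            if done.value == "f":
                return "f"          # one failure falsifies the conjunction
            continue                # "t": the goal is dropped from the queue
        queue.append(goal)          # not finished: back of the queue
    return "t"

def left_to_right(goals, max_steps=10_000):
    """Depth-first evaluation: each goal is run to completion before the next."""
    steps = 0
    for goal in goals:
        while True:
            if steps == max_steps:
                return "u"
            steps += 1
            try:
                next(goal)
            except StopIteration as done:
                if done.value == "f":
                    return "f"
                break
    return "t"

print(fair_conjunction([looping(), finishes(3, "f")], max_steps=1000))
print(left_to_right([looping(), finishes(3, "f")], max_steps=1000))
```

With a looping first goal and a failing second goal, the fair strategy reports the failure while the left-to-right strategy exhausts its step bound, mirroring the incompleteness described above.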

6 Conclusion and Future Work 

The experimental results are very encouraging. Table 1 includes some measurements
of execution times (in milliseconds⁶) to get the first solution of a list of positive goals,
their negation using negation as failure (naf) when applicable, their negation using
our implementation of the constructive classical negation of Chan (cneg), and their
negation using Constructive Intensional Negation, CIN (intneg). The ratios of
constructive negation and of CIN w.r.t. the positive goal, and the ratio of constructive
negation w.r.t. CIN, are also shown.

We present three sets of examples. The first one collects some examples where using
CIN is slightly worse or a little better than using negation as failure. The second
set contains examples in which negation as failure cannot be used. Here CIN is very
efficient, in the sense that its execution time is similar to the time needed to execute
the positive goal. The third set of examples is the most interesting because it contains more



³ All measurements were made using Ciao Prolog 1.5 on a Pentium II at 350 MHz. Small
programs were executed a sufficient number of times to obtain reliable data.
goals                 G    naf(G)   cneg(G)   ratio   intneg(G)   ratio   cneg/intneg

boole(1)            780      829      1319     1.69        780     1.00          1.69
positive(500)      1450     1810      2090     1.44       3430     2.36          0.60
positive(700)      1450     1810      2099     1.44       5199     3.58          0.40
positive(1000)     2070     2370      2099     1.01       8979     4.33          0.23
less(0,50000)      1189     1199     23209    19.51       6520     5.48          3.55
less(50000,0)      1179     1140      4819     4.08      10630     9.01          0.45
average                                        4.86                4.29          1.15

boole(X)            829        -      1340     1.61        830     1.00          1.61
ancestor(peter,X)   820        -      1350     1.64        821     1.00          1.64
average                                        1.63                1.00          1.62

less(50000,X)      1430        -     28159    19.69       6789     4.74          4.14
less(100000,X)     1930        -     54760    28.37      13190     6.83          4.15
less(X,50000)      1209        -     51480    42.58      12210    10.09          4.21
less(X,100000)     1580        -    102550    64.90      24099    15.25          4.25
add(X,Y,100000)    2219        -     25109    11.31       4209     1.81          5.96
add(100000,Y,Z)    2360        -     12110     5.10       2659     1.12          4.55
add(X,100000,Z)    2160        -     23359    10.81       2489     1.15          9.38
average                                       26.10                5.86          5.23

Table 1. Runtime comparison



complex goals where negation as failure cannot be used and where CIN is, on average,
more than 5 times faster than constructive negation.
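The ratio columns of Table 1 can be recomputed from the timing columns. The short Python sketch below does this for three rows copied from the table. It assumes that the published ratios are truncated (not rounded) to two decimals, which is consistent with the rows used here; the helper names `trunc2` and `ratios` are hypothetical.

```python
import math

def trunc2(x):
    """Truncate to two decimals (assumption about how the table was produced)."""
    return math.floor(x * 100) / 100

# (time of G, time of cneg(G), time of intneg(G)) in ms, copied from Table 1
rows = {
    "boole(1)":      (780, 1319, 780),
    "less(0,50000)": (1189, 23209, 6520),
    "less(X,50000)": (1209, 51480, 12210),
}

def ratios(goal, cneg, intneg):
    """Return (cneg/G, intneg/G, cneg/intneg), each truncated to two decimals."""
    return trunc2(cneg / goal), trunc2(intneg / goal), trunc2(cneg / intneg)

for name, times in rows.items():
    print(name, ratios(*times))
```

For instance, for less(X,50000) this gives 42.58, 10.09 and 4.21, the values listed in the table.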

CIN is just a part of a more general project to implement negation in which several
other approaches are used: negation as failure (possibly combined with dynamic goal
reordering), IN, and constructive negation. The decision about which approach can be used
is fixed by the information of different program analyses: naf in case of groundness of
statically detected goal reordering, finite constructive negation in case of finiteness of
the number of solutions, etc. See [18] for details. Notice that the strategy also ensures
the soundness of the method [19]: if the analysis is correct, the precondition to apply a
technique is ensured, so the results are always sound.

The main contributions of our CIN approach are: 

- The formulation in terms of disequality constraints, which solves some of the problems
of IN [6] in a more efficient way than [7] because universal quantification is defined
without using the program code.

- The computation of universally quantified goals, which was sketched in [6]; we
provide here an implementation and a discussion of its soundness and completeness.

- To our knowledge, it is one of the first serious attempts to include intensional
negation in a Prolog compiler.

As future work, we plan to include this implementation in a compiler in order
to produce a version of Ciao Prolog with negation. Our implementation of CIN can be
generalized to other simple Prolog systems with the attributed variables extension,
which is the only requirement for implementing disequality constraints.
As Ciao allows CLP over various constraint domains, one might wonder if our
method is directly applicable. One problem is that Lemma 2 relies on a property of
Herbrand constraints not enjoyed, in general, by other domains. Other than that, the
only special requirement put by the transformation on the constraint domain is the
ability to compute, from an admissible constraint $c$, a constraint $\neg\exists x.\, c$.
This is, strictly speaking, a bit stronger than the admissible closure requirement of [22],
as this does not imply the existence of disjunctive constraints, but obtaining a domain
with constructive disjunction from one satisfying admissible closure is almost trivial.
See [12] for an account of the necessity of admissible closure for constructive negation.
The extension of the technique that implements the universal quantification to other
constraint domains remains open.

Another field of work is the optimization of the implementation of the forall using
different search techniques and more specialized ways of generating the coverings of
the Herbrand Universe of our negation subsystem.
Abstract. This paper describes how high level implementations of (needed)
narrowing into Prolog can be improved by analysing definitional
trees. First, we introduce a refined representation of definitional trees 
that handles properly the knowledge about the inductive positions of 
a pattern. The aim is to take advantage of the new representation of 
definitional trees to improve the aforementioned kind of implementation 
systems. Second, we introduce selective unfolding transformations, on 
determinate atom calls in the Prolog code, by examining the existence 
of what we call “deterministic (sub) branches” in a definitional tree. As 
a result of this analysis, we define some generic algorithms that allow 
us to compile a functional logic program into a set of Prolog clauses 
which increases determinism and incorporates some refinements that are 
obtained by ad hoc artifices in other similar implementations of func- 
tional logic languages. We also present and discuss the advantages of our 
proposals by means of some simple examples. 

Keywords: Functional logic programming, narrowing strategies, imple- 
mentation of functional logic languages, program transformation. 



1 Introduction 

Functional logic programming [12] aims to implement programming languages 
that integrate the best features of both functional programming and logic pro- 
gramming. Most of the approaches to the integration of functional and logic 
languages consider term rewriting systems as programs and some narrowing
strategy as a complete operational mechanism. Laziness is a valuable feature of
functional logic languages, since it increases the expressive power of this kind 
of languages: it supports computations with infinite data structures and a mod- 
ular programming style. Among the different lazy narrowing strategies, needed 
narrowing [6] has been postulated optimal from several points of view. Needed 
narrowing addresses computations by means of some structures, namely defi- 
nitional trees [2], which contain all the information about the program rules. 

* Supported by CICYT TIC 2001-2705-C03-01, Accion Integrada Hispano-Italiana 
HI2000-0161, and Accion Integrada Hispano-Alemana HA2001-0059. 
These structures allow us to select a position of the term which is being evaluated;
this position points to a reducible subterm whose reduction is “unavoidable”
in order to obtain the result of the computation. It is accepted that
the framework for declarative programming based on non-deterministic lazy functions
of [18] also uses definitional trees as part of its computational mechanism.
In recent years, a great effort has been made to provide the integrated languages
with high level implementations of this computational model into Prolog (see
for instance [5,7,13,16] and [19]). This paper investigates how an analysis of
definitional trees can introduce improvements in the quality of the Prolog code
generated by these implementation systems.

The paper is organized as follows: Section 2 recalls some basic notions we use 
in the rest of the sections. In Section 3 we describe a refined representation of 
definitional trees and we give an algorithm for their construction in the style of 
[15]. Section 4 introduces two new translation techniques: Section 4.1 discusses 
how to take advantage of the new representation of definitional trees to improve 
(needed) narrowing implementations; Section 4.2 presents an algorithm, guided 
by the structure of a definitional tree, which is able to produce the same effect as 
if a determinate unfolding transformation was applied on the compiled Prolog 
code. Section 5 presents some experiments that show the effectiveness of our 
proposals. Section 6 discusses the relation of our techniques to other research on
functional logic programming and logic programming. Finally, Section 7 contains 
our conclusions. 

2 Preliminaries 

We consider first order expressions or terms built from symbols of the set of
variables $\mathcal{X}$ and the set of function symbols $\mathcal{F}$ in the usual way.
The set of terms is denoted by $\mathcal{T}(\mathcal{F}, \mathcal{X})$. We sometimes write
$f/n \in \mathcal{F}$ to denote that $f$ is an $n$-ary function symbol. If $t$ is a term
different from a variable, $\mathit{Root}(t)$ is the function symbol heading $t$, also
called the root symbol of $t$. A term is linear if it does not contain multiple occurrences
of the same variable. $\mathit{Var}(o)$ is the set of variables occurring in the syntactic
object $o$. We write $\overline{o_n}$ for the sequence of objects $o_1, \ldots, o_n$.

A substitution $\sigma$ is a mapping from the set of variables to the set of terms,
with finite domain $\mathit{Dom}(\sigma) = \{x \in \mathcal{X} \mid \sigma(x) \neq x\}$.
We denote the identity substitution by $id$. We define the composition of two
substitutions $\sigma$ and $\theta$, denoted $\sigma \circ \theta$, as usual:
$\sigma \circ \theta(x) = \hat{\sigma}(\theta(x))$, where $\hat{\sigma}$ is the extension
of $\sigma$ to the domain of the terms. A renaming is a substitution $\rho$ such that there
exists the inverse substitution $\rho^{-1}$ and
$\rho \circ \rho^{-1} = \rho^{-1} \circ \rho = id$.

A term $t$ is more general than $s$ (or $s$ is an instance of $t$), in symbols $t \le s$,
if $(\exists \sigma)\ s = \sigma(t)$. Two terms $t$ and $t'$ are variants if there exists
a renaming $\rho$ such that $t' = \rho(t)$. We say that $t$ is strictly more general than
$s$, denoted $t < s$, if $t \le s$ and $t$ and $s$ are not variants. The quasi-order
relation “$\le$” on terms is often called subsumption order and “$<$” is called strict
subsumption order.

Positions of a term t (also called occurrences) are represented by sequences 
of natural numbers used to address subterms of t. The concatenation of the 
sequences $p$ and $w$ is denoted by $p.w$. Two positions $p$ and $p'$ of $t$ are comparable
if $(\exists w)\ p' = p.w$ or $p = p'.w$; otherwise they are disjoint positions. Given a
position $p$ of $t$, $t|_p$ denotes the subterm of $t$ at position $p$ and $t[s]_p$ denotes
the result of replacing the subterm $t|_p$ by the term $s$. Let $\overline{p_n}$ be a
sequence of disjoint positions of a term $t$; then $t[s_1]_{p_1} \ldots [s_n]_{p_n}$
denotes the result of simultaneously replacing each subterm $t|_{p_i}$ by the term $s_i$,
with $i \in \{1, \ldots, n\}$.
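The notions of position, subterm and replacement can be made concrete with a small Python sketch. The encoding is an assumption made for the illustration: a variable is a string, a non-variable term is a tuple whose first component is the root symbol, and a position is a tuple of 1-based argument indices. The function names are hypothetical.

```python
def subterm(t, p):
    """t|_p: the subterm of t at position p."""
    for i in p:
        t = t[i]              # t[0] is the root symbol, t[i] the i-th argument
    return t

def replace(t, p, s):
    """t[s]_p: the result of replacing the subterm of t at position p by s."""
    if not p:
        return s
    i = p[0]
    return t[:i] + (replace(t[i], p[1:], s),) + t[i + 1:]

def comparable(p, q):
    """Two positions are comparable if one of them is a prefix of the other."""
    n = min(len(p), len(q))
    return p[:n] == q[:n]

# The term f(g(X), a)
t = ("f", ("g", "X"), ("a",))
print(subterm(t, (1, 1)))            # the variable X
print(replace(t, (1, 1), ("b",)))    # f(g(b), a)
print(comparable((1,), (1, 1)), comparable((1,), (2,)))
```

Positions 1 and 1.1 are comparable, whereas 1 and 2 are disjoint, so simultaneous replacement at 1 and 2 is well defined.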



2.1 Term Rewriting Systems 

We limit the discussion to unconditional term rewriting systems¹. A rewrite rule
is a pair $l \to r$ with $l, r \in \mathcal{T}(\mathcal{F}, \mathcal{X})$,
$l \notin \mathcal{X}$, and $\mathit{Var}(r) \subseteq \mathit{Var}(l)$. The terms $l$ and
$r$ are called the left-hand side (lhs) and right-hand side (rhs) of the rewrite rule,
respectively. A term rewriting system (TRS) $\mathcal{R}$ is a finite set of rewrite rules.

We are especially interested in TRSs whose associated signature $\mathcal{F}$ can be
partitioned into two disjoint sets $\mathcal{F} = \mathcal{C} \uplus \mathcal{D}$, where
$\mathcal{D} = \{\mathit{Root}(l) \mid (l \to r) \in \mathcal{R}\}$ and
$\mathcal{C} = \mathcal{F} \setminus \mathcal{D}$. Symbols in $\mathcal{C}$ are called
constructors and symbols in $\mathcal{D}$ are called defined functions or operations.
Terms built from symbols of the set of variables $\mathcal{X}$ and the set of constructors
$\mathcal{C}$ are called constructor terms. A pattern is a term of the form
$f(\overline{d_n})$ where $f/n \in \mathcal{D}$ and $\overline{d_n}$ are constructor terms.
A term $f(\overline{x_n})$, where $\overline{x_n}$ are different variables, is called a
generic pattern. A TRS is said to be constructor-based (CB) if the lhs of its rules are
patterns. For CB TRSs, a term $t$ is a head normal form (hnf) if $t$ is a variable or
$\mathit{Root}(t) \in \mathcal{C}$.

A TRS is said to be left-linear if for each rule $l \to r$ in the TRS, the lhs $l$
is a linear term. We say that a TRS is non-ambiguous or non-overlapping if it
does not contain critical pairs (see [9] for a standard definition of a critical pair).
Left-linear and non-ambiguous TRSs are called orthogonal TRSs.

Inductively sequential TRSs are a proper subclass of CB orthogonal TRSs.
The definition of this class of programs makes use of the notion of definitional tree.
For the sake of simplicity, and because further complications are irrelevant for our
study, in the following definition we ignore the exempt nodes that appear in the
original definition of [2] and also the or-nodes of [16] used in the implementation
of Curry [15]. Note also that or-nodes lead to parallel definitional trees and thus
out of the class of inductively sequential systems.

Definition 1. [Partial definitional tree]

Given a CB TRS $\mathcal{R}$, $\mathcal{P}$ is a partial definitional tree with pattern
$\pi$ if and only if one of the following cases holds:

1. $\mathcal{P} = \mathit{rule}(\pi, l \to r)$, where $\pi$ is a pattern and $l \to r$ is
a rewrite rule in $\mathcal{R}$ such that $\pi$ is a variant of $l$.

2. $\mathcal{P} = \mathit{branch}(\pi, o, \overline{\mathcal{P}_k})$, where $\pi$ is a
pattern, $o$ is a variable position of $\pi$ (called inductive position),
$\overline{c_k}$ are different constructors, for some $k > 0$, and for all
$i \in \{1, \ldots, k\}$, $\mathcal{P}_i$ is a partial definitional tree with pattern
$\pi[c_i(\overline{x_n})]_o$, where $n$ is the arity of $c_i$ and $\overline{x_n}$ are
new variables.

¹ This is not a true limitation for the expressiveness of a programming language
relying on this class of term rewriting systems [4].
Fig. 1. Definitional trees for the function “f” of Example 1: a) standard definitional
tree; b) refined definitional tree, with its deterministic (sub)branch marked.



From a declarative point of view, a partial definitional tree $\mathcal{P}$ can be seen as
a set of linear patterns partially ordered by the strict subsumption order “<”
[3]. Given a defined function $f/n$, a definitional tree of $f$ is a partial definitional
tree whose pattern is a generic pattern and whose leaves contain variants of all the
rewrite rules defining $f$.

Example 1. Given the rules defining the function f/3

R1 : f(a, b, X) → r1,    R2 : f(b, a, c) → r2,    R3 : f(c, b, X) → r3.

a definitional tree of f is:

branch(f(X1, X2, X3), 1,
    branch(f(a, X2, X3), 2, rule(f(a, b, X3), R1)),
    branch(f(b, X2, X3), 2, branch(f(b, a, X3), 3, rule(f(b, a, c), R2))),
    branch(f(c, X2, X3), 2, rule(f(c, b, X3), R3)))

Note that there can be more than one definitional tree for a defined function. It is 
often convenient and simplifies understanding to provide a graphic representation 
of definitional trees, where each node is marked with a pattern and the inductive 
position in branches is surrounded by a box. Figure 1(a) illustrates this concept. 

Definition 2. [Inductively Sequential TRS] 

A defined function $f$ is called inductively sequential if it has a definitional tree.
A rewrite system $\mathcal{R}$ is called inductively sequential if all its defined
functions are inductively sequential.

In this paper we are mainly interested in inductively sequential TRSs (or proper 
subclasses of them) which are called programs. 

2.2 Definitional Trees and Narrowing Implementations into Prolog 

Most of the relevant implementations of functional logic languages, which use 
needed narrowing as operational mechanism, are based on the compilation of the 
programs written in these languages into Prolog [7,13,16,17]. These implementation
systems may be thought of as a translation process that essentially consists
of the following:
1. An algorithm to transform the program rules in a functional logic program
into a set of definitional trees (see [16] and [15] for some of those algorithms).

2. An algorithm that takes the definitional trees as an input parameter and 
visits their nodes, generating a Prolog clause for each visited node. Since 
definitional trees contain all the information about the original program as 
well as information to guide the (optimal) pattern matching process during 
the evaluation of expressions, the set of generated Prolog clauses is able to 
simulate the intended narrowing strategy being implemented. 



In the case of functional logic programs with a needed narrowing semantics, a
generic algorithm for the translation of definitional trees into a set of clauses is
given in [13]. When we apply that algorithm to the definitional tree of function
f in Example 1, we obtain the following set of Prolog clauses:

% Clause for the root node: it exploits the first inductive position
f(X1, X2, X3, H) :- hnf(X1, HX1), f_1(HX1, X2, X3, H).

% Clauses for the remaining nodes:
f_1(a, X2, X3, H) :- hnf(X2, HX2), f_1_a_2(HX2, X3, H).
f_1_a_2(b, X3, H) :- hnf(r1, H).

f_1(b, X2, X3, H) :- hnf(X2, HX2), f_1_b_2(HX2, X3, H).
f_1_b_2(a, X3, H) :- hnf(X3, HX3), f_1_b_2_a_3(HX3, H).
f_1_b_2_a_3(c, H) :- hnf(r2, H).

f_1(c, X2, X3, H) :- hnf(X2, HX2), f_1_c_2(HX2, X3, H).
f_1_c_2(b, X3, H) :- hnf(r3, H).

where hnf(T, H) is a predicate that is true when H is the hnf of a term T. For
this example, the clauses defining the predicate hnf are:

% Evaluation to head normal form (hnf).
hnf(T, T) :- var(T), !.
hnf(f(X1, X2, X3), H) :- !, f(X1, X2, X3, H).
hnf(T, T).    % otherwise the term T is a hnf

The meaning of this set of clauses is as follows. To evaluate a term t = f(t1, t2, t3)
to a hnf, it is first necessary to evaluate (to a hnf) the subterms of t
at the inductive positions of the patterns in the definitional tree associated with
f (in the order dictated by that definitional tree; see Figure 1(a)). Hence, for
our example: we compute the hnf of t1 and then the hnf of t2; if b is the hnf
of t1 and a is the hnf of t2, we have to compute the hnf of t3; if the hnf of t3
is c then the hnf of t will be the hnf of r2, else the computation fails (see the
sixth clause). On the other hand, if the hnf of t1 is a or c, it suffices to evaluate
t2 to a hnf, disregarding t3, in order to obtain the final value. This evaluation
mechanism conforms with the needed narrowing strategy of [6], as has been
formally demonstrated in [1].
3 A Refined Representation of Definitional Trees

As we have just seen, building definitional trees is the first step of the compi- 
lation process in high level implementations of needed narrowing into Prolog. 
Therefore, providing a suitable representation structure for the definitional trees 
associated with a functional logic program may be an important task in or- 
der to improve those systems. In this section we give a refined representation 
of definitional trees that saves memory allocation and is the basis for further 
improvements.

It is noteworthy that the function f of Example 1 has two definitional trees:
the one depicted in Figure 1(a) and a second one obtained by exploiting position 2
of the generic pattern f(X1, X2, X3). Hence, this generic pattern has
two inductive positions. As we are going to show, we can take advantage of this
situation if we “simultaneously” exploit these two inductive positions. The main
idea of the refinement is as follows: when a pattern has several inductive positions,
exploit them all together. Therefore we need a criterion to detect inductive
positions. This criterion exists and it is based on the concept of uniformly demanded
position, which was introduced into the functional logic setting by J. Moreno-Navarro
and M. Rodriguez-Artalejo et al. (see, for instance, [16]).

Definition 3. [Uniformly demanded position]

Given a pattern $\pi$ and a TRS $\mathcal{R}$, let
$\mathcal{R}_\pi = \{l \to r \mid (l \to r) \in \mathcal{R} \wedge \pi \le l\}$. A
variable position $p$ of the pattern $\pi$ is said to be: (i) demanded by a lhs $l$ of a
rule in $\mathcal{R}_\pi$ if $\mathit{Root}(l|_p) \in \mathcal{C}$; (ii) uniformly
demanded by $\mathcal{R}_\pi$ if $p$ is demanded by all lhs in $\mathcal{R}_\pi$.

We write $\mathit{UDPos}(\pi)$ to denote the set of uniformly demanded positions of the
pattern $\pi$. The following proposition establishes a necessary condition for a
position of a pattern to be an inductive position.
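Definition 3 can be checked mechanically on Example 1. The Python sketch below is only an illustration under stated assumptions: variables are strings, other terms are tuples headed by their root symbol, positions are tuples of 1-based indices, and every non-variable subterm of a lhs below the root is taken to be constructor-rooted (which holds for patterns). The names `udpos`, `instance_of`, etc. are hypothetical.

```python
def is_var(t):
    return isinstance(t, str)

def subterm(t, p):
    """t|_p for a position p given as a tuple of 1-based indices."""
    for i in p:
        t = t[i]
    return t

def var_positions(t, prefix=()):
    """All positions of t at which a variable occurs."""
    if is_var(t):
        return [prefix]
    return [q for i, a in enumerate(t[1:], start=1)
            for q in var_positions(a, prefix + (i,))]

def instance_of(l, pi):
    """True if l is an instance of the linear pattern pi (pi <= l)."""
    if is_var(pi):
        return True
    if is_var(l) or l[0] != pi[0] or len(l) != len(pi):
        return False
    return all(instance_of(a, b) for a, b in zip(l[1:], pi[1:]))

def udpos(pi, lhss):
    """Uniformly demanded positions of pi w.r.t. the rules with these lhs."""
    r_pi = [l for l in lhss if instance_of(l, pi)]
    return [p for p in var_positions(pi)
            if r_pi and all(not is_var(subterm(l, p)) for l in r_pi)]

a, b, c = ("a",), ("b",), ("c",)
lhss = [("f", a, b, "X"), ("f", b, a, c), ("f", c, b, "X")]   # Example 1

print(udpos(("f", "X1", "X2", "X3"), lhss))   # positions 1 and 2
print(udpos(("f", b, a, "X3"), lhss))         # position 3
print(udpos(("f", a, b, "X3"), lhss))         # no uniformly demanded position
```

Position 3 of the generic pattern is not uniformly demanded because the lhs of R1 and R3 have a variable there, whereas it is demanded once the pattern has been refined to f(b, a, X3).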

Proposition 1. Let $\mathcal{R}$ be an inductively sequential TRS and let $\pi$ be the
pattern of a branch node of a definitional tree $\mathcal{P}$ of a function defined in
$\mathcal{R}$. If $o$ is an inductive position of $\pi$ then $o$ is uniformly demanded by
$\mathcal{R}_\pi$.

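The converse proposition is more involved but not difficult to establish. In the
following, given two partial definitional trees $\mathcal{P}_1$ and $\mathcal{P}_2$, we
say $\mathcal{P}_1 \preceq \mathcal{P}_2$ if and only if $\mathcal{P}_1 = \mathcal{P}_2$
or $\mathcal{P}_1 \prec \mathcal{P}_2$, where $\mathcal{P}_1 \prec \mathcal{P}_2$ if
$\mathcal{P}_1$ is a proper subtree of $\mathcal{P}_2$.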

Proposition 2. Let $\mathcal{R}$ be an inductively sequential TRS. Let $\mathcal{P}$ be a
partial definitional tree, with pattern $\pi$, and $o$ a variable position of $\pi$. If
$o$ is uniformly demanded by $\mathcal{R}_\pi$ then there exists a partial definitional
tree $\mathcal{P}' \preceq \mathcal{P}$, with pattern $\pi'$, such that $o$ is an
inductive position of $\pi'$.

Hence, the concept of uniformly demanded position and Proposition 1 give
us a syntactic criterion to detect if a variable position of a pattern is an inductive
position or not and, therefore, a guideline to build a definitional tree: (i) Given
a branch node, select a uniformly demanded position of its pattern; fix it as
an inductive position of the branch node and generate the corresponding child
nodes. (ii) If the node doesn't have uniformly demanded positions then there are
two possibilities: the node is a leaf node, if it is a variant of a lhs of the considered
TRS, or it is a “failure” node, and it is impossible to build the definitional tree.
The following algorithm, in the style of [15], uses this scheme to build a refined
partial definitional tree $\mathit{rpdt}(\pi, \mathcal{R}_\pi)$ for a pattern $\pi$ and
rules $\mathcal{R}_\pi = \{l \to r \mid (l \to r) \in \mathcal{R} \wedge \pi \le l\}$:

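1. If $\mathit{UDPos}(\pi) = \emptyset$ and there is only one rule
$(l \to r) \in \mathcal{R}_\pi$ and a renaming $\rho$ such that $\pi = \rho(l)$:

$\mathit{rpdt}(\pi, \mathcal{R}_\pi) = \mathit{rule}(\pi, \rho(l) \to \rho(r))$;

2. If $\mathit{UDPos}(\pi) \neq \emptyset$ and for all
$(c_{i_1}, \ldots, c_{i_m}) \in C_\pi$,
$\mathcal{P}_i = \mathit{rpdt}(\pi_i, \mathcal{R}_{\pi_i}) \neq \mathit{fail}$:

$\mathit{rpdt}(\pi, \mathcal{R}_\pi) = \mathit{branch}(\pi, \overline{o_m}, \overline{\mathcal{P}_k})$

where $\overline{o_m}$ is the sequence of uniformly demanded positions in
$\mathit{UDPos}(\pi)$,

$C_\pi = \{(c_{i_1}, \ldots, c_{i_m}) \mid (l_i \to r_i) \in \mathcal{R}_\pi \wedge \mathit{Root}(l_i|_{o_1}) = c_{i_1} \wedge \ldots \wedge \mathit{Root}(l_i|_{o_m}) = c_{i_m}\}$,

$\pi_i = \pi[c_{i_1}(\overline{x_{n_1}})]_{o_1} \ldots [c_{i_m}(\overline{x_{n_m}})]_{o_m}$,

and $\overline{x_{n_1}}, \ldots, \overline{x_{n_m}}$ are new variables.

3. Otherwise, $\mathit{rpdt}(\pi, \mathcal{R}_\pi) = \mathit{fail}$.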

Given an inductively sequential TRS $\mathcal{R}$ and an $n$-ary defined function $f$ in
$\mathcal{R}$, the definitional tree of $f$ is
$\mathit{rdt}(f, \mathcal{R}) = \mathit{rpdt}(\pi_0, \mathcal{R}_{\pi_0})$, where
$\pi_0 = f(\overline{x_n})$. Note that, for an algorithm like the one described in [15],
the selection of the inductive positions of the pattern $\pi$ is non-deterministic if
$\mathit{UDPos}(\pi) \neq \emptyset$. Therefore, it is possible to build different
definitional trees for an inductively sequential function, depending on the inductive
position which is selected. On the contrary, our algorithm deterministically produces a
single definitional tree for each inductively sequential function. Note also that it
matches the more informal algorithm that appears in [15] when, for each branch node,
there is only one inductive position.

For the defined function f in Example 1, the last algorithm builds the following
definitional tree:

branch(f(X1, X2, X3), (1, 2),
    rule(f(a, b, X3), R1),
    branch(f(b, a, X3), (3), rule(f(b, a, c), R2)),
    rule(f(c, b, X3), R3))
which is depicted in Figure 1(b). As we said, for this example, the standard
algorithm of [15] may build two definitional trees for f (depending on whether
position 1 or position 2 is selected as the inductive position of the generic pattern
f(X1, X2, X3)). Both of these trees have eight nodes, while the new representation
cuts the number of nodes of the definitional tree to five. We claim
that the new representation reduces the number of nodes in general and, also,
the number of possible definitional trees associated with a function (actually,
there is only one refined definitional tree for each defined function).
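The construction of the refined tree can be sketched in Python. This is a minimal illustration of the three cases of the rpdt algorithm, not the authors' implementation, and it assumes the following encoding: variables are strings, other terms are tuples headed by their root symbol, positions are tuples of 1-based indices, rules are given as pairs (lhs, name), and fresh variables are named `_V1`, `_V2`, ... (a hypothetical naming scheme). Failure is represented by None.

```python
from itertools import count

_fresh = count(1)   # supply of fresh variable names (hypothetical scheme)

def is_var(t):
    return isinstance(t, str)

def subterm(t, p):
    for i in p:
        t = t[i]
    return t

def replace(t, p, s):
    if not p:
        return s
    i = p[0]
    return t[:i] + (replace(t[i], p[1:], s),) + t[i + 1:]

def var_positions(t, prefix=()):
    if is_var(t):
        return [prefix]
    return [q for i, a in enumerate(t[1:], start=1)
            for q in var_positions(a, prefix + (i,))]

def instance_of(l, pi):
    """True if l is an instance of the linear pattern pi."""
    if is_var(pi):
        return True
    if is_var(l) or l[0] != pi[0] or len(l) != len(pi):
        return False
    return all(instance_of(a, b) for a, b in zip(l[1:], pi[1:]))

def udpos(pi, lhss):
    return [p for p in var_positions(pi)
            if lhss and all(not is_var(subterm(l, p)) for l in lhss)]

def rpdt(pi, rules):
    r_pi = [(l, name) for l, name in rules if instance_of(l, pi)]
    positions = udpos(pi, [l for l, _ in r_pi])
    if not positions:
        # Case 1: a single rule whose lhs is a variant of pi gives a leaf.
        if len(r_pi) == 1 and instance_of(pi, r_pi[0][0]):
            return ("rule", pi, r_pi[0][1])
        return None                              # Case 3: fail
    # Case 2: exploit all uniformly demanded positions at once.
    children, seen = [], set()
    for l, _ in r_pi:
        key = tuple((subterm(l, p)[0], len(subterm(l, p)) - 1)
                    for p in positions)
        if key in seen:
            continue
        seen.add(key)
        child_pattern = pi
        for p, (c, arity) in zip(positions, key):
            args = tuple(f"_V{next(_fresh)}" for _ in range(arity))
            child_pattern = replace(child_pattern, p, (c,) + args)
        child = rpdt(child_pattern, r_pi)
        if child is None:
            return None
        children.append(child)
    return ("branch", pi, positions, children)

def size(tree):
    """Number of nodes of a tree built by rpdt."""
    return 1 if tree[0] == "rule" else 1 + sum(size(ch) for ch in tree[3])

a, b, c = ("a",), ("b",), ("c",)
rules = [(("f", a, b, "X"), "R1"), (("f", b, a, c), "R2"),
         (("f", c, b, "X"), "R3")]
tree = rpdt(("f", "X1", "X2", "X3"), rules)
print(tree)
print(size(tree))
```

For the rules of Example 1 the sketch builds a root that branches on positions 1 and 2 simultaneously, with five nodes in total, in agreement with the tree shown above.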

As has been proposed in [7], it is possible to obtain a simpler translation
scheme of functional logic programs into Prolog if definitional trees are first
compiled into case expressions. That is, functions are defined by only one rule
where the lhs is a generic pattern and the rhs contains case expressions to specify
the pattern matching of actual arguments. The use of case expressions doesn't
invalidate our argumentation. Thus, we can transform the last refined definitional
tree into the following case expression:

f(X1, X2, X3) = case (X1, X2) of
                    (a, b) → r1
                    (b, a) → case (X3) of (c) → r2
                    (c, b) → r3

A case expression like this will be evaluated by reducing a tuple of arguments
to their hnf and matching them with one of the patterns of the case expression.
4 Improving Narrowing Implementations into Prolog 

This section discusses two improvements in the translation of non-strict func- 
tional logic programs into Prolog which are based on the analysis of definitional 
trees. These translation techniques can be applied jointly or separately. 

4.1 Translation Based on Refined Definitional Trees 

The refined representation of definitional trees introduced in Section 3 is very 
close to the standard representation of definitional trees, but it is enough to 
provide further improvements in the translation of functional logic programs 
into Prolog. 

It is easy to adapt the translation algorithm that appears in [13] to use our
refined representation of definitional trees as input. If we apply this slightly
different algorithm to the refined definitional tree of Figure 1(b), we obtain the
following set of clauses, where the inductive positions 1 and 2 are exploited
simultaneously:

% Clause for the root node:
f(X1, X2, X3, H) :- hnf(X1, HX1), hnf(X2, HX2), f_1_2(HX1, HX2, X3, H).

% Clauses for the remaining nodes:
f_1_2(a, b, X3, H) :- hnf(r1, H).
f_1_2(b, a, X3, H) :- hnf(X3, HX3), f_1_2_b_a(HX3, H).
f_1_2_b_a(c, H) :- hnf(r2, H).
f_1_2(c, b, X3, H) :- hnf(r3, H).

where we have cut the number of clauses with regard to the standard representation
into Prolog (of the rules defining function f) presented in Section 2.2. The
number of clauses is reduced in the same proportion as the number of nodes of
the standard definitional tree for f: from eight to five. As we are going to show in
Section 5, this refined translation technique is able to improve the efficiency of the
implementation system.

On the other hand, it is important to note that the kind of improvements
we are mainly studying in this subsection cannot be obtained by an unfolding
transformation process applied to the set of clauses produced by the standard
algorithm of [13]. In fact, it is not possible to obtain the previous set of clauses
by an unfolding transformation of the set of clauses shown in Section 2.2.



4.2 Selective Unfolding Transformations 

The analysis of definitional trees provides further opportunities for improving 
the translation of inductively sequential programs into Prolog. For instance, we 
can notice that the definitional tree of function f in Example 1 has a
“deterministic” (sub)branch, that is, a (sub)branch whose nodes have only one 
child (see Figure 1(b)). This knowledge can be used as a heuristic guide for 
applying determinate unfolding transformation steps selectively. 

Note that, for the example we are considering, the clauses:

f_1_2(b, a, X3, H) :- hnf(X3, HX3), f_1_2_b_a(HX3, H).    % (C1)
f_1_2_b_a(c, H) :- hnf(r2, H).                            % (C2)

can be merged into:

f_1_2(b, a, X3, H) :- hnf(X3, c), hnf(r2, H).             % (C')

by applying a safe unfolding transformation in the style of Tamaki and Sato [21]
but restricting ourselves to determinate atoms [10] (i.e., atoms that match
exactly one clause head in the Prolog code): we take clause C1 (the unfolded
clause) and we select the atom f_1_2_b_a(HX3, H) in its body; this atom call is
unifiable with the head of clause C2 (the unique unfolding clause for this atom
call), with most general unifier $\sigma = \{HX3/c\}$ (actually, a matcher). Therefore,
we can perform a transformation step where C1 and C2 are instantiated by applying
$\sigma$, the atom call is unfolded and, afterwards, clauses C1 and C2 are replaced by
C'.

This selective unfolding is preferable to a generalized (post-compilation) unfolding
transformation process², which may degrade the efficiency of the compiled
Prolog code. Moreover, this selective unfolding transformation can be easily
integrated inside the compilation procedure described in [13]. It suffices to introduce
an additional case in order to treat deterministic (sub)branches:



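$\mathit{Trans}(\mathit{branch}(\pi, o, \mathcal{T}^1), p) :=$

    if $\mathcal{T}^n = \mathit{branch}(\pi_n, o_n, \overline{\mathcal{T}'})$
    produceCode:

        $f_p(t_1, \ldots, t_m, H)$ :-
            hnf($X$, $\pi_1|_o$), hnf($\pi_1|_{o_1}$, $\pi_2|_{o_1}$), ..., hnf($\pi_{n-1}|_{o_{n-1}}$, $\pi_n|_{o_{n-1}}$),
            hnf($\pi_n|_{o_n}$, $Y$), $f_{p \cup \{o, o_1, \ldots, o_n\}}(t'_1, \ldots, t'_m, H)$.

        $\mathit{Trans}(\overline{\mathcal{T}'}, p \cup \{o, o_1, \ldots, o_n\})$;

    else if $\mathcal{T}^n = \mathit{rule}(\pi_n, \pi_n \to r)$
    produceCode:

        $f_p(t_1, \ldots, t_m, H)$ :-
            hnf($X$, $\pi_1|_o$), hnf($\pi_1|_{o_1}$, $\pi_2|_{o_1}$), ..., hnf($\pi_{n-1}|_{o_{n-1}}$, $\pi_n|_{o_{n-1}}$),
            hnf($r$, $H$).

    where $\pi = f(t_1, \ldots, t_m)$, $\pi|_o = X$, $\mathcal{T}^1, \ldots, \mathcal{T}^n$ is
    the sequence of nodes in the deterministic (sub)branch with
    $\mathcal{T}^i = \mathit{branch}(\pi_i, o_i, \mathcal{T}^{i+1})$, and
    $\pi_n[Y]_{o_n} = f(t'_1, \ldots, t'_m)$.

$\mathit{Trans}(\overline{\mathcal{T}_n}, p) := \mathit{Trans}(\mathcal{T}_1, p), \ldots, \mathit{Trans}(\mathcal{T}_n, p)$;

² That is, a transformation process where non-determinate atom calls are unfolded too.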

Now, each function $f$ is translated by $\mathit{Trans}(\mathcal{T}, \emptyset)$, where
$\mathcal{T}$ is a definitional tree of $f$.

Roughly speaking, the new case in the algorithm of [13] can be understood as
follows. If there exists a deterministic (sub)branch, visit its nodes in descending
order, forcing the evaluation (to hnf) of the subterm at the inductive position
$o$ of a term to be the flat constructor at position $o$ of the child node. Proceed
in this way until: i) a non-deterministic node is reached; or ii) a leaf node is
reached and, in this case, evaluate the rhs of the rule to its hnf and stop the
translation.

The last algorithm allows some improvements we have omitted for the sake
of simplicity. First, it is possible to eliminate redundant arguments. Second, it
is possible to exploit rule nodes (i.e., an atom call like hnf(r, H)) to perform
an additional determinate unfolding step³. Taking all this into consideration, the
following example illustrates the algorithm.

Example 2. Given the rules defining the partial function even

R1 : even(0) → true,
R2 : even(s(s(X))) → even(X).

and its definitional tree:

branch(even(X1), 1,
    rule(even(0), R1),
    branch(even(s(X2)), 1.1,
        rule(even(s(s(X3))), R2)))

the Prolog code generated by the Trans algorithm is:

% Evaluation to head normal form (hnf).
hnf(even(X1), H) :- !, even(X1, H).

% Clause for the root node: it exploits the first inductive position
even(X1, H) :- hnf(X1, HX1), even_1(HX1, H).
even_1(0, true).

% Clause for the deterministic (sub)branch:
even_1(s(X2), H) :- hnf(X2, s(X3)), even(X3, H).

Note that the determinate call hnf(even(X3), H) has been unfolded (into the call
even(X3, H), using the first rule for evaluating a hnf).
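The behaviour of the generated clauses can be mimicked by a small Python model. It is only a sketch under strong assumptions: it handles ground Peano terms only (so it models rewriting rather than narrowing), terms are tuples headed by their root symbol, and a failing Prolog call is modelled by the hypothetical exception `Fail`.

```python
class Fail(Exception):
    """Models the failure of a Prolog goal."""

def hnf(t):
    """Head normal form of a ground term; `even` is the only defined function."""
    if t[0] == "even":
        return even(t[1])
    return t                      # constructor-rooted terms are already hnf

def even(x1):
    # Root node: evaluate the argument at inductive position 1.
    return even_1(hnf(x1))

def even_1(hx1):
    if hx1 == ("0",):
        return ("true",)
    if hx1[0] == "s":
        # Deterministic (sub)branch: the hnf of the inner argument is
        # required to be of the form s(X3); then the rhs even(X3) is evaluated.
        inner = hnf(hx1[1])
        if inner[0] == "s":
            return even(inner[1])
    raise Fail

def peano(n):
    """Build the Peano representation of the natural number n."""
    t = ("0",)
    for _ in range(n):
        t = ("s", t)
    return t

print(even(peano(4)))
try:
    even(peano(3))
except Fail:
    print("fail")
```

Since even is a partial function, the call on an odd number fails instead of returning a value, just as the single merged clause does when hnf(X2, s(X3)) cannot be satisfied.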

³ These improvements are implemented in the curry2prolog compiler of Pakcs [8]
for the standard cases.
Therefore, our Trans algorithm, guided by the structure of a definitional tree,
is able to reproduce the effect of a post-compilation unfolding transformation
when it is applied selectively on determinate atom calls in the standard compiled
Prolog code.

5 Experiments 

We have carried out some experiments to verify the effectiveness of our proposals. We
have instrumented the Prolog code obtained by the compilation of simple Curry
programs using the curry2prolog compiler of Pakcs [8] (an implementation
of the multi-paradigm declarative language Curry [15]). We have introduced
our translation techniques in the resulting Prolog code. For our first translation
technique, the one using the refined representation of definitional trees, the
results of the experiments are shown in Table 1. Runtime and memory occupation
were measured on a Sun4 Sparc machine, running Sicstus v3.8 under SunOS
v5.7. The “Speedup” column indicates the percentage of execution time saved
by our translation technique. The values shown in that column are the percentage
of the quantity computed by the formula $(t_1 - t_2)/t_1$, where $t_1$ and $t_2$ are
the average runtimes, for several executions, of the proposed terms (goals) and
Prolog programs obtained when we don't use ($t_1$) and when we use ($t_2$) our
translation technique. The “G. stack Imp.” column reports the improvement of memory
occupation for the computation. We have measured the percentage of global
stack allocation. The amount of memory allocation measured between each
execution remains constant. Most of the benchmark programs are extracted from


Table 1. Runtime speedup and memory usage improvements for some benchmark
programs and terms.

Benchmark   Term                  Speedup   G. stack Imp.
---------   -------------------   -------   -------------
family      grandfather(_, _)     19.9%     0%
geq         geq(100000, 99999)    4.6%      16.2%
geq         geq(99999, 100000)    4.3%      16.2%
xor         xor(_, _)             18.5%     0%
zip         zip(L1, L2)           3.6%      5.5%
zip3        zip3(L1, L2, L2)      4.5%      10%
Average                           9.2%      7.9%



[15] and the standard prelude for Curry programs with slight modifications^.
For the benchmark programs family and xor we evaluate all outcomes. The
natural numbers are implemented in Peano notation, using zero and succ as
constructors of the sort. In the zip and zip3 programs the input terms L1 and
L2 are lists of length 9.

^ For example, zip (resp. zip3) is adapted for combining two (resp. three) lists of
elements of equal length into one list of pairs (resp. triples) of the corresponding
elements. However, this function also may be useful in a practical context (see [14],
page 280).

Regarding the second translation technique, the one which implements se- 
lective unfolding transformations, for the benchmark program of Example 2 we 
obtain an average speedup of 11.7% and an improvement in memory usage of 
14.7%. 
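
The percentages reported in this section can be recomputed with a few lines of
Python. The sketch below assumes that the "Speedup" value is the saved-time
ratio 100 * (t1 - t2) / t1, with t1 and t2 the average runtimes without and
with the technique; the runtimes in the example call are hypothetical, since
the raw timings are not given here.

```python
def speedup_percent(t1, t2):
    """Percentage of execution time saved: 100 * (t1 - t2) / t1."""
    return 100.0 * (t1 - t2) / t1


# "Speedup" column of Table 1, one entry per benchmark term (in percent).
table1_speedups = [19.9, 4.6, 4.3, 18.5, 3.6, 4.5]
average = sum(table1_speedups) / len(table1_speedups)

# Hypothetical runtimes (not from the paper): 2.0 s without, 1.6 s with.
example = speedup_percent(2.0, 1.6)
```

Rounded to one decimal, the average of the six entries agrees with the
"Average" row of Table 1.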

More detailed information about the experiments and benchmark programs 
can be found in http://www.inf-cr.uclm.es/www/pjulian/publications.html. 



6 Discussion and Related Work 

In this section we discuss some important issues and, where appropriate, relate
them to other research on functional logic programming and logic programming.

Elimination of ad hoc artifices. It is noteworthy that, in some cases, the
benefits of our first translation scheme are obtained in an ad hoc way in current
implementations of needed narrowing into Prolog. For instance, the standard
definition of the strict equality used in non-strict functional logic languages
is [11,19]:

$$c == c \to true$$
$$c(X_1, \ldots, X_n) == c(Y_1, \ldots, Y_n) \to X_1 == Y_1 \,\&\&\, \ldots \,\&\&\, X_n == Y_n$$

where c is a constructor of arity 0 in the first rule and arity n > 0 in the second
rule. There is one of these rules for each constructor that appears in the program
we are considering. Clearly, the strict equality has an associated definitional tree
whose pattern $(X_1 == X_2)$ has two uniformly demanded positions (positions
1 and 2) and, therefore, it can be translated using our first technique, which
produces a set of Prolog clauses similar to the one obtained by the curry2prolog
compiler. Thus, the curry2prolog compiler produces an optimal representation
of the strict equality, which is treated as a special system function with an ad
hoc predefined translation into Prolog, instead of using the standard translation
algorithm that is applied to user-defined functions.

Failing derivations. Our first contribution, as well as the overall theory of
needed evaluation, is interesting for computations that succeed. However, it is
important to note that some problems may arise when a computation does not
terminate or fails. For example, given the (partial) function {f(a, a) → a}, the
standard compilation into Prolog is:

f(A,B,C) :- hnf(A,F), f_1(F,B,C).
f_1(a,A,B) :- hnf(A,E), f_1_a_2(E,B).
f_1_a_2(a,a).

while our first translation technique produces:

f(A,B,C) :- hnf(A,F), hnf(B,G), f_1(F,G,C).
f_1(a,a,a).
Now, if we want to compute the term f(b, expensive_term), the standard
implementation detects the failure after the computation of the first argument. On
the other hand, the new implementation computes the expensive term (to hnf)
for nothing. Of course, the standard implementation has problems too (e.g., if
we compute the term f(expensive_term, b), it also computes the expensive
term to hnf), but it may behave better on this problem. Thus, in
a sequential implementation, the performance of our first translation technique
may suffer when subterms at uniformly demanded positions are evaluated
(to hnf) jointly with another subterm whose evaluation (to hnf) produces
a failure. An alternative to overcome this practical disadvantage is to evaluate
these subterms in parallel, introducing monitoring techniques able to detect the
failure as soon as possible and then to stop the streams of the computation.
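
The difference between the two evaluation orders can be made concrete with a
small Python simulation. It is only a sketch: arguments are modelled as
hypothetical (name, thunk) pairs, hnf records which arguments are evaluated,
and None stands for failure; it does not reproduce narrowing or backtracking.

```python
def hnf(arg, log):
    """Evaluate an argument to head normal form, recording the evaluation."""
    name, thunk = arg
    log.append(name)
    return thunk()


def f_standard(a, b, log):
    """Standard scheme: inspect the first argument before touching the second."""
    if hnf(a, log) != "a":
        return None            # failure detected early
    if hnf(b, log) != "a":
        return None
    return "a"


def f_uniform(a, b, log):
    """First translation technique: evaluate both demanded positions, then match."""
    x, y = hnf(a, log), hnf(b, log)
    return "a" if (x, y) == ("a", "a") else None


b_term = ("b", lambda: "b")
expensive = ("expensive_term", lambda: "a")   # hypothetical costly subterm
```

For f(b, expensive_term), f_standard logs only the evaluation of b, whereas
f_uniform also logs expensive_term; for f(expensive_term, b) both versions
evaluate the expensive term.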

Clause indexing and direct implementation into Prolog. Clause indexing
is a technique, used in the implementation of Prolog compilers, that aims to
reduce the number of clauses on which unification with a goal is performed. In
general, indexing techniques are based on the inspection of the outermost function
symbol of one or more arguments in a clause head. If the predicate symbol
and the respective indexed symbols of the clause head and the goal coincide, then
the clause is selected as part of the filtered set. Afterwards, unification with
the goal is attempted only on the clauses in the filtered set (presumably smaller
than the original one). More sophisticated indexing techniques, such as those
described in [20], perform indexing on all non-variable symbols of a clause head
(losing no significant structural information). Also, these techniques are able to
obtain the unifier during the indexing process. Although there seem to be some
similarities between indexing techniques and the standard operational mechanism
of functional logic languages, there is a big difference: in the context of pure
logic languages terms are dead structures. However, in the context of this work,
the concept of evaluation strategy relies on the existence and manipulation of
nested alive terms. The needed narrowing strategy, as defined in [6], is a mapping
from terms and partial definitional trees to sets of triples (position,
rule, substitution), where each triple gives the position in the term, the rule of
the program and the unifier substitution (not necessarily a most general one)
used in a narrowing step. Our work is concerned with the optimization of certain
techniques for implementing needed narrowing in Prolog.
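
To make the filtering step concrete, the following Python sketch implements
plain first-argument indexing. It is an illustration under stated assumptions
and not the technique of [20]: clause heads and goals are tuples (pred, arg1,
...), compound terms are tuples whose first component is the function symbol,
None stands for an unbound variable, and the helper names are hypothetical.

```python
from collections import defaultdict


def functor(term):
    """Outermost symbol of a term; None stands for an unbound variable."""
    if term is None:
        return None
    return term[0] if isinstance(term, tuple) else term


def build_index(clauses):
    """Group clause heads by predicate symbol and first-argument symbol."""
    index = defaultdict(list)
    for clause in clauses:
        pred, first = clause[0], clause[1]
        index[(pred, functor(first))].append(clause)
    return index


def candidates(index, goal):
    """Filtered set: the clauses on which unification with the goal is attempted."""
    pred, key = goal[0], functor(goal[1])
    if key is None:
        # Unbound first argument: every clause of the predicate is a candidate.
        return [c for (p, _), cs in index.items() if p == pred for c in cs]
    # Clauses with the same symbol, plus clauses with a variable first argument.
    return index.get((pred, key), []) + index.get((pred, None), [])
```

With the two even_1 clause heads of Example 2, a goal whose first argument is
rooted by s is filtered down to the single clause for the deterministic
(sub)branch. Note that this sketch does not preserve the textual order of
clauses across different index entries, which a real Prolog system must do.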

On the other hand, a direct representation of a function in Prolog is possible,
and it is often more efficient, since term structures with nested function
calls are not generated. However, a direct implementation corresponds to a
call-by-value strategy, which lacks some valuable properties (such as the ability
to handle infinite data structures or a good termination behavior) [13].

Determinate unfolding. Determinate unfolding [10] has been proposed as
a way to ensure that the specialization of a logic program will never duplicate
computations. The advantages of determinate unfolding transformations, in the
context of the implementation of functional logic languages into Prolog, were
suggested in [13] and [7]. They proposed to apply determinate unfolding as a
post-compilation process but actually, in the curry2prolog compiler, determinate
unfolding steps are only applied to unfold the atom calls produced by rule
nodes. The novelty of our proposal is that it exploits all opportunities for
determinate unfolding in a systematic way and that it is embedded inside the
compilation process.

7 Conclusions 

In this paper we have introduced a refined representation of definitional trees that
eliminates the indeterminism in the selection of definitional trees in the context
of the needed narrowing strategy (there is only one refined definitional tree for
each inductively sequential function). We have defined two translation techniques
based on the analysis of (refined) definitional trees. Although the results of the
experiments section reveal a good behavior of these translation techniques, it is
difficult to evaluate their impact on the whole system, since the
improvements appear only when we can detect patterns that have several uniformly
demanded positions or deterministic (sub)branches in a (refined)
definitional tree. Nevertheless, our work shows that there is potential for
improving current (needed) narrowing implementations: we obtain
valuable improvements in execution time and memory allocation when our
translation techniques are relevant. For inductively sequential functions
without the aforementioned features, our translation schemes are conservative
and produce neither runtime speedups nor memory allocation improvements.
Although failing derivations are a problematic case in which the performance of
our first translation technique may suffer, we can deal with this problem
by introducing concurrent computations, in order to guarantee that no slowdowns
with regard to standard implementations of needed narrowing into Prolog
are produced. Hence, the occurrence of several inductive positions in a pattern
can be considered as a signal for exploiting implicit parallelism.

On the other hand, our simple translation techniques are able to eliminate
some ad hoc artifices in current implementations of (needed) narrowing into
Prolog, providing a systematic and efficient translation mechanism. Moreover, the
ideas we have just developed can be introduced with a modest programming
effort in standard implementations of needed narrowing into Prolog (such as the
Pakcs [8] implementation of Curry) and in other implementations based on the
use of definitional trees (e.g., the implementation of the functional logic language
TOY [17]), since they do not modify their basic structures.
Abstract. We present a graphical tool for the declarative debugging of
wrong answers in functional-logic languages. The tool, integrated in the
system TOY, can be used to navigate a computation tree corresponding
to some erroneous computation. During the navigation the users can ei- 
ther follow a fixed strategy or move freely providing information about 
the validity of the nodes as they like. We show by means of examples 
how this flexibility can reduce both the number and the complexity of 
the questions that the user must consider w.r.t. the usual top-down navi- 
gation strategy. Moreover, the tool includes some extra features that can 
be used to automatically simplify the computation trees. 



1 Introduction 

The idea of declarative debugging was first proposed by E.Y. Shapiro [18] in the
field of Logic Programming, and has been developed not only in this paradigm
[11,7] but also in constraint-logic programming [20], functional programming
[15,14,17], and functional-logic programming [5,4,6].

The overall idea in all the cases is the same, as pointed out by Lee Naish in [13].
The debugging starts when some erroneous computation, the so-called initial
symptom, is found, and can be described as a two-stage process:

- First, the declarative debugger builds a suitable computation tree for the initial
symptom. Each node in this tree keeps the result of some subcomputation and
can be associated with the fragment of code responsible for it. In particular, the
root represents the (wrong) result of the main computation. The children of a
given node must correspond to the intermediate subcomputations needed for
obtaining the result at that node. This phase is automatically performed by the
debugger; the user’s assistance is only required to detect the initial symptom.

- Once the tree is obtained it is navigated looking for a buggy node, i.e. a node
with an erroneous result whose children nodes have correct results. Such a node
is associated with a fragment of code that has produced an erroneous output from
correct inputs, and will be pointed out by the debugger as one bug in the
program. In order to check whether the nodes are correct or not, some external
oracle, generally the user, is needed.

One of the main criticisms of this technique w.r.t. other debugging
methods, such as abstract diagnosis [2,1], is the number and complexity of the
questions that the user must answer during the navigation phase. The problem
has been considered in (constraint) logic programming [18,20], but has received
little attention in functional and functional-logic programming, where most of
the work is devoted to the definition of suitable computation trees and to
devising mechanisms for their implementation. The declarative debuggers proposed
for functional and functional-logic programming, to the best of our knowledge,
perform a top-down traversal of the tree during the navigation phase. As we will
see, this is only satisfactory for trees containing a buggy node near the root;
otherwise the number and complexity of the questions can make the debugging
process unrealistic.

In this paper we present DDT (an acronym for Declarative Debugging Tool), a
graphical declarative debugger included as part of the lazy functional-logic system
TOY [12]. However, the ideas and techniques presented here are also valid for
the declarative debugging of wrong answers in other lazy functional-logic
languages such as Curry [10] or lazy functional languages such as Haskell [16].

DDT allows the user either to navigate the computation tree freely or to select
one of the default strategies provided by the tool to guide the navigation. In the
paper we show how these possibilities can be used to reduce both the number and
the complexity of the questions that the user must consider during the debugging
process. Moreover, DDT also incorporates two techniques for simplifying the
computation tree, the first one based on the notion of entailment proposed in [6],
and the second one based on the use of another program as a (generally partial)
correct specification of the intended program semantics. These two features can
be used to determine the validity of some nodes of the computation tree in
advance, thus simplifying the role of the user during the navigation phase.

The structure of the paper is as follows: the next section introduces some preliminary
concepts and presents the general aspect of the tool. Section 3 explains by means
of an example how the flexibility of the navigation in DDT can be used to
detect buggy nodes more easily. Section 4 discusses the strategies provided by
the system, while Section 5 presents the two techniques used to simplify the
computation tree mentioned above. Finally, Section 6 concludes and points to
some planned future work.

DDT is part of the distribution of the TOY system, which is available at
http://titan.sip.ucm.es.



2 Initial Concepts 

As we said in the previous section, DDT is integrated in the lazy FLP language
TOY [12]. In this section we first recall some basics about the language
and illustrate them with an example. Then some basic notions and properties
regarding computation trees are presented.



2.1 The TOY Language

Programs in TOY can include data type declarations, type aliases, infix operator
declarations, function type declarations, and defining rules for function symbols.
Before describing the structure of the defining rules we must define some initial
notions such as expressions and patterns. A more detailed description of the
syntax of the language can be found in [12].

The syntax of partial expressions $e \in Exp_\perp$ is $e ::= \perp \mid X \mid h \mid (e\ e')$, where $X$
is a variable and $h$ either a function symbol or a data constructor. Expressions
of the form $(e\ e')$ stand for the application of expression $e$ (acting as a function)
to expression $e'$ (acting as an argument). In the rest of the paper the notation
$e\ e_1\ e_2 \ldots e_n$ (or even $e\ \overline{e}_n$) is used as a shorthand for $(\ldots((e\ e_1)\ e_2) \ldots e_n)$.
Similarly, the syntax of partial patterns $t \in Pat_\perp \subset Exp_\perp$ can be defined as
$t ::= \perp \mid X \mid c\ t_1 \ldots t_m \mid f\ t_1 \ldots t_m$, where $X$ represents a variable, $c$ a data
constructor of arity greater than or equal to $m$, and $f$ a function symbol of arity
greater than $m$, while the $t_i$ are partial patterns for all $1 \le i \le m$. We define
the approximation ordering $\sqsubseteq$ as the least partial ordering over $Pat_\perp$ satisfying
the two following properties:

- $\perp \sqsubseteq t$, for all $t \in Pat_\perp$.

- $h\ \overline{t}_m \sqsubseteq h\ \overline{s}_m$ if $h\ \overline{t}_m, h\ \overline{s}_m \in Pat_\perp$ and $t_i \sqsubseteq s_i$ for all $1 \le i \le m$.

Expressions and patterns without any occurrence of $\perp$ are called total.
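
The approximation ordering and the notion of totality can be checked
mechanically. The Python sketch below assumes a hypothetical encoding that is
not part of TOY: the undefined value is the string "_", applications of a
symbol h to arguments are tuples (h, t1, ..., tm), and variables and numbers
are atoms that approximate only themselves.

```python
BOTTOM = "_"   # stands for the undefined value (bottom)


def approx(t, s):
    """True if the partial pattern t approximates the partial pattern s."""
    if t == BOTTOM:
        return True                      # bottom approximates every pattern
    if isinstance(t, tuple) and isinstance(s, tuple):
        # Same head symbol, same number of arguments, argumentwise approximation.
        return (t[0] == s[0] and len(t) == len(s)
                and all(approx(ti, si) for ti, si in zip(t[1:], s[1:])))
    return t == s                        # atoms: only reflexivity applies


def is_total(t):
    """A pattern is total if it contains no occurrence of bottom."""
    if t == BOTTOM:
        return False
    if isinstance(t, tuple):
        return all(is_total(ti) for ti in t[1:])
    return True
```

For instance, with lists built from a hypothetical "cons" symbol, the pattern
("cons", 1, BOTTOM) approximates ("cons", 1, ("cons", 1, BOTTOM)), but not the
other way round, and neither of them is total.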

The defining rules for a function $f$ are composed of a left-hand side, a right-hand
side, an optional condition, and some optional local definitions:

$$(R)\quad \underbrace{f\ t_1 \ldots t_n}_{\text{left-hand side}} = \underbrace{r}_{\text{right-hand side}} \Leftarrow \underbrace{C}_{\text{condition}}\ \textit{where}\ \underbrace{LD}_{\text{local definitions}}$$

The condition has the form $C \equiv e_1 == e'_1, \ldots, e_k == e'_k$, while the local
definitions are $LD \equiv \{s_1 \leftarrow a_1; \ldots; s_m \leftarrow a_m\}$, where $e_i$, $e'_i$, and $r$ are total
expressions, the $t_j$, $s_j$ are total patterns with no variable occurring more than
once in different $t_k$, $t_l$ or in different $s_k$, $s_l$, and no variable in $s_i$ occurring in
$a_j$ for $1 \le j \le i \le m$. Roughly, the intended meaning of a program rule like
$(R)$ is that a call to the function $f$ can be reduced to $r$ whenever the actual
parameters match the patterns $t_i$, using the local definitions $LD$, and ensuring
that the conditions $e_i == e'_i$ are satisfied. A condition $e == e'$ is satisfied by
evaluating $e$ and $e'$ to some common total pattern.

A formal semantic calculus for TOY programs is described in [8,9], and has
been adapted to the declarative debugging of wrong answers in [5,6]. As we
proved in [5,6], a simplification of the proof trees in this semantic calculus can be
employed as suitable computation trees for the declarative debugging of wrong
answers in lazy functional-logic languages. The nodes of such computation trees
fib                    = [1, 1 | fibAux 1 1]
fibAux N M             = [N+M | fibAux N (N+M)]

goldenApprox           = (tail fib) ./. fib

infixr 20 ./.
[X | Xs] ./. [Y | Ys]  = [X/Y | Xs ./. Ys]

tail [X|Xs]            = Xs

take 0 L               = []
take N [X|Xs]          = [X | take (N-1) Xs] <== N>0

main R                 = true <== take 5 goldenApprox == R

Fig. 1. Approximating the Golden Ratio



are always basic facts of the form $f\ t_1 \ldots t_n \to t$, with $f$ a function symbol of
arity $n$ and $t_1, \ldots, t_n \in Pat_\perp$. The idea is that a basic fact $f\ t_1 \ldots t_n \to t$ can
be proved w.r.t. some program $P$ iff the pattern $t$ approximates the value of the
function call $(f\ t_1 \ldots t_n)$ in $P$. Since the value $\perp$ approximates all the values
in our semantics, trivial basic facts of the form $f\ t_1 \ldots t_n \to \perp$ can always be
proved. In [5,6] we also proved that any buggy node in these computation trees
is associated with some erroneous function rule of the program, in fact with the
function rule used to prove the basic fact labelling the node. The intended model
of a program $P$ is the set $I$ of basic facts that the user expects to be provable
w.r.t. $P$. A node of the computation tree is correct if its basic fact belongs to $I$,
and incorrect otherwise. Correct nodes are also called valid w.r.t. $I$.

We will assume that every program $P$ includes a special program rule main of the
form main $X_1 \ldots X_k$ = true <== $C$, where $\{X_1, \ldots, X_k\}$, $k > 0$, is the set
of variables occurring in the condition $C$. The system will compute substitutions
$\sigma$, called answers, of patterns for variables such that $dom(\sigma) \subseteq \{X_1, \ldots, X_k\}$,
meaning that the basic fact $(main\ X_1 \ldots X_k)\sigma \to true$ can be proved w.r.t.
$P$. Notice that this notion of goal, suitable for this work, is compatible with
actual goals in TOY, which are of the form $e_1 == e'_1, \ldots, e_k == e'_k$, simply by
assuming that the goal is the condition of an implicit program rule for main.

2.2 An Example 

Figure 1 shows a TOY program whose purpose is to approximate the number
known as the golden ratio, by using the Fibonacci sequence 1, 1, 2, 3, 5, ...,
where each term in the sequence (after the second) is the sum of the two that
immediately precede it. If we call fib(i) the i-th term of this sequence, the
following property holds:

$$\lim_{n \to \infty} \frac{fib(n+1)}{fib(n)} = \frac{1+\sqrt{5}}{2}$$

The program contains a function fib that represents an infinite list containing the
Fibonacci sequence. This function uses an auxiliary function fibAux. Given two
integer values $X_0$ and $X_1$, the function call fibAux $X_0$ $X_1$ is expected to compute
the infinite list $[X_2, X_3, \ldots]$, such that $X_k = X_{k-2} + X_{k-1}$ for any $k \ge 2$.
Function goldenApprox computes the infinite list [fib(2)/fib(1), fib(3)/fib(2), ...]
using the infix operator ./. that returns the result of dividing two infinite lists
term by term. The meaning of the rest of the functions should be clear from the
context. Some basic facts included in the intended model $I$ of the program are:

$$
\begin{array}{l}
I = \{\ \ldots,\ fib \to \perp,\ \ldots,\ fib \to [1 \mid \perp],\ \ldots,\ fib \to [1, 1, 2, 3, 5, 8 \mid \perp],\ \ldots, \\
\qquad \ldots,\ fibAux\ 1\ 1 \to [2, 3, 5 \mid \perp],\ \ldots,\ fibAux\ 10\ 20 \to [30, 50, 80, 130 \mid \perp],\ \ldots, \\
\qquad \ldots,\ main\ [1, 2, 1.5, 1.66, 1.6] \to true,\ \ldots\ \}
\end{array}
$$

among others. In particular, the basic fact main [1, 2, 1.5, 1.66, 1.6] $\to$ true is
expected since [1/1, 2/1, 3/2, 5/3, 8/5] = [1, 2, 1.5, 1.66, 1.6] (rounding to two
decimals for simplicity) is the list of the first five approximations to the golden
ratio using the Fibonacci sequence. However, the system computes the answer
$\sigma = \{R \mapsto [1, 2, 1.5, 1.33, 1.25]\}$. This is a wrong answer, since $main\ R\sigma \to true$ is
not in the intended model of the program, and constitutes the initial symptom
showing that there is some bug in the program.

The computation tree for this wrong computation can be seen in Figure 2, as
displayed by DDT. The root of the tree corresponds to the initial symptom and
the children of each node correspond to the function calls needed for computing
the basic fact at the node. For instance the root has two children:

(1) goldenApprox -> [1, 2, 1.5, 1.33, 1.25 | _ ]

(2) take 5 [1, 2, 1.5, 1.33, 1.25 | _ ] -> [1, 2, 1.5, 1.33, 1.25]

corresponding to the two function calls in the condition of main instantiated
with the values used during the computation. The character _ in the display
represents the symbol $\perp$, and stands in place of some value whose evaluation was
not needed during the computation. The basic fact (2) is valid in the intended
model, but the basic fact (1) is not, since the fourth and fifth members of the list
at the right-hand side of the basic fact should be 5/3 = 1.66 and 8/5 = 1.6
respectively.

In the debugging session of the figure we have provided information about the
validity of all the nodes, although usually this is not necessary as we will see
in Sections 3 and 4. At the bottom of the display DDT shows data about the
number of nodes of each kind, including unknown nodes, corresponding
to basic facts whose validity has not yet been determined, and trusted nodes, which
correspond to basic facts associated with trusted functions. In this example the
user has decided that functions take and tail are trusted and hence all the basic
facts corresponding to calls of these functions will be considered valid by the
Fig. 2. Computation Tree for the program of Figure 1

[Menu screenshots. Recoverable entries: Open, Reload, Save, Save As; node
states Valid, Non-Valid, Unknown, Trusted, Buggy; Expand Tree, Collapse Tree,
Remove Trusted Nodes, Remove Valid & Trusted Nodes, Remove Selected Node
Subtree, Make Selected Node New Root; Select menu: Deepest Non-Valid Node,
Deepest Desc with the Same Rule, Center of the Tree; Strategy menu: No
Strategy, Top-Down, Divide & Query]

Fig. 3. Some menu options in DDT
debugger. The computation tree has two buggy nodes (one appears selected in
the figure), both of them corresponding to the application of the single function
rule for fibAux, which is therefore a wrong rule and will be pointed out by the
debugger as the cause of the bug. The bug in this rule is in the first argument
of the recursive call, which should be M instead of N. The correct rule is then:
fibAux N M = [N+M | fibAux M (N+M)].
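
The effect of the bug can be reproduced outside TOY. The following Python
sketch models the lazy lists of Figure 1 with generators and exact fractions;
the function names and the boolean parameter fixed, which switches between the
buggy and the corrected recursive call, are assumptions of this sketch.

```python
from fractions import Fraction
from itertools import islice


def fib_aux(n, m, fixed):
    """Lazy counterpart of fibAux; `fixed` selects the corrected recursive call."""
    while True:
        yield n + m
        n, m = (m, n + m) if fixed else (n, n + m)


def fib(fixed):
    """fib = [1, 1 | fibAux 1 1]."""
    yield 1
    yield 1
    yield from fib_aux(1, 1, fixed)


def golden_approx(fixed):
    """(tail fib) ./. fib: divide the list by itself shifted by one position."""
    tail = islice(fib(fixed), 1, None)
    for x, y in zip(tail, fib(fixed)):
        yield Fraction(x, y)


def main(k, fixed):
    """take k goldenApprox."""
    return list(islice(golden_approx(fixed), k))
```

Here main(5, False) yields 1, 2, 3/2, 4/3, 5/4, which corresponds to the wrong
answer [1, 2, 1.5, 1.33, 1.25], while main(5, True) yields 1, 2, 3/2, 5/3, 8/5.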

2.3 Computation Trees 

Next we present some definitions and auxiliary notation about computation trees
(CTs for short). Notice that although some basic facts can occur repeatedly in
a CT, any node can be identified by its path to the root. Given a CT $T$ and a
node $N \in T$, we will use the notation $root(T)$ to represent the root of the tree,
$subtree(T, N)$ for the subtree of $T$ whose root is $N$, and $children(T, N)$ to
represent a list with the children nodes of $N$ in $T$. If $N$ is valid w.r.t. the intended
interpretation $I$ we will write $valid(N)$, assuming that $T$ is clear from the context.
Analogously $nonvalid(N)$ will represent a non-valid node, while $buggy(T, N)$
will mean that $nonvalid(N)$ and $valid(N')$ for all $N' \in children(T, N)$. Finally,
$buggy(T)$ will indicate that there exists a node $N \in T$ such that $buggy(T, N)$.

The number of nodes in a computation tree $T$ will be represented as $|T|$. Let
$N$ be a node in $T$ such that $N \neq root(T)$, and let $P$ be the parent of $N$
in $T$. Then the notation $T - N$ represents the new tree obtained by removing
$N$ from $T$ and letting the children of $N$ become children of $P$. Hence, if
$N_1, \ldots, N_{i-1}, N, N_{i+1}, \ldots, N_m$ are the children of $P$ in $T$ and
$N'_1, \ldots, N'_k$ are the children of $N$ in $T$, the children of $P$ in $T - N$ will be
$N_1, \ldots, N_{i-1}, N'_1, \ldots, N'_k, N_{i+1}, \ldots, N_m$.
With these definitions two interesting properties of computation trees can be
proved. Given a computation tree $T$:

P1 If $N \in T$ and $nonvalid(N)$, then there is some node $N' \in subtree(T, N)$ such
that $buggy(T, N')$ and the path from $N$ to $N'$ only has non-valid nodes.

P2 If $N \in T$ and $valid(N)$, then $buggy(T)$ iff $buggy(T - N)$, and for every
$N' \in T - N$ such that $buggy(T - N, N')$, $buggy(T, N')$ holds.

At several places in the rest of the paper we will use these properties for
justifying the correctness of various DDT features.
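
The notation of this section translates almost literally into code. The
following Python sketch uses a hypothetical Node class whose valid field plays
the role of the oracle; remove implements the operation T - N by modifying the
tree in place, which is a design choice of this sketch rather than something
prescribed by the definitions.

```python
class Node:
    def __init__(self, fact, valid, children=()):
        self.fact = fact                  # label, e.g. a basic fact as a string
        self.valid = valid                # verdict w.r.t. the intended model
        self.children = list(children)


def nodes(t):
    """All nodes of the tree rooted at t, in preorder."""
    yield t
    for c in t.children:
        yield from nodes(c)


def buggy(n):
    """buggy(T, N): N is non-valid and all its children are valid."""
    return not n.valid and all(c.valid for c in n.children)


def tree_is_buggy(t):
    """buggy(T): some node of T is buggy."""
    return any(buggy(n) for n in nodes(t))


def remove(t, n):
    """T - N: drop the non-root node n; its children take its place under its parent."""
    for p in nodes(t):
        for i, c in enumerate(p.children):
            if c is n:
                p.children[i:i + 1] = n.children
                return t
    raise ValueError("node is the root or does not belong to the tree")
```

On a small tree with a non-valid root, a valid child a that has a non-valid
leaf b, and a valid child c, removing a leaves b as the only buggy node, and b
was already buggy in the original tree, as property P2 states.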

3 Free Navigation 

The TOY system currently includes two declarative debugging navigators: a
textual top-down navigator similar to those of Buddha [17] and Freja [14], and
the graphical navigator DDT. This section shows by means of an example how
the flexibility allowed by DDT can reduce the number and complexity of the
nodes that the user must examine in comparison to the top-down navigators.
Let us consider again the program of Figure 1, but replacing the rule of the
function main by main R = true <== take 15 goldenApprox == R. The answer
computed by the system is again wrong and therefore the declarative debugger
can be employed. By using the top-down navigator of TOY we can obtain the
following debugging session, where we have replaced part of the lists by dots for
the sake of saving space in this presentation.

Consider the following facts:

1: goldenApprox -> [1, 2, 1.5, 1.33, 1.25, 1.2, ... | _]

2: take 15 [1, 2, 1.5, 1.33, 1.25, 1.2, ... | _] -> [1, 2, 1.5, 1.33, 1.25, 1.2, ...]

Are all of them valid? ([y]es / [n]ot) / [a]bort) n

Enter the number of a non-valid fact followed by a fullstop: 1.

Consider the following facts:

1: fib -> [1, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 | _]

2: tail [1, 1, 2, 3, 4, 5, ... | _] -> [1, 2, 3, 4, 5, ... | _]

3: fib -> [1, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14 | _]

4: [1, 2, 3, 4, ... | _] ./. [1, 1, 2, 3, ... | _] -> [1, 2, 1.5, 1.33, 1.25, ... | _]

Are all of them valid? ([y]es / [n]ot) / [a]bort) n

Enter the number of a non-valid fact followed by a fullstop: 1.

Consider the following facts:

1: fibAux 1 1 -> [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 | _]

Are all of them valid? ([y]es / [n]ot) / [a]bort) n

Consider the following facts:

1: fibAux 1 2 -> [3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 | _]

(12 more similar questions involving fibAux)

Rule number 1 of the function fibAux is wrong.

We have omitted 12 additional questions about the validity of basic facts
involving fibAux. Even assuming that the user knows the answer to these questions,
the process can be tedious. With greater values in the argument of take, the
use of the top-down textual navigator will become unrealistic. It could be argued
that the user should stop after a few questions about fibAux and try some easier
goal, but in general it is not always feasible to replace the goal which has produced
an error symptom by a simpler one.

Let us now examine a possible debugging session for the same program using
DDT. The display is not completely shown in some of the images due to the
lack of space. In the initial state of the debugger only the first level of the tree,
with the two children of the root, is expanded:




The user observes that the node corresponding to goldenApprox is not valid,
changes its state to non-valid and expands the node to examine its children.
The state of a node can be changed in DDT either by using the option menu
“Nodes” (see Figure 3), or by right-clicking on the node and selecting the state
from the menu that appears, as the next image shows:




The next image shows the debugging session after descending in this way four 
levels in the tree: 




At this point the user, maybe getting bored, can check that the next nodes in the
same path correspond to the function fibAux, and seem to be non-valid. Then
the option “Deepest Descendant with the Same Rule” of the menu “Select” can
be used. This option automatically looks for the deepest use of fibAux in the
subtree whose root is selected. In this example, the system finds and selects
the node containing the basic fact fibAux 1 14 -> [15 | _]:




The user can readily check that this node is valid and change its state
accordingly. Moving now bottom-up, the user detects that the parent of the selected
node contains the non-valid basic fact fibAux 1 13 -> [14, 15 | _] (the second
element of the list should be 14+13 = 27 instead of 15) and changes its state
accordingly. At this moment DDT detects that this node is buggy and shows a
message reporting to the user that the only program rule for fibAux, used at the
buggy node, has been detected as incorrect.

Due to the recursive structure of function definitions, the option “Deepest
Descendant with the Same Rule” will often render a new current node N whose
basic fact is smaller and thus simpler to analyze. Navigation can then proceed
by moving down to N's children in case N is non-valid, and moving up
to N's ancestors otherwise. In both cases, property P1 guarantees that a buggy
node can eventually be found, because N is known to have an invalid ancestor.

Of course the textual debugger could be enhanced with more sophisticated
options, similar to those of DDT mentioned here. However, we think that the
graphical display provides a more general perspective of the CTs that allows, in
many cases, a quicker detection of the buggy nodes.



4 Strategy-Guided Navigation: Top-Down Versus Divide-and-Query

Although free navigation can be used to reduce the number of nodes considered
during the debugging process, DDT also includes the possibility of using a
strategy-guided navigation and provides two possibilities: the top-down and the
divide-and-query strategies.

The top-down strategy behaves essentially like the textual debugger presented
in the previous section. The process starts with a computation tree whose root
is considered non-valid. Then the children of the root are examined, looking for
some non-valid child. If such a child is found, the debugging continues by examining
its corresponding subtree. Otherwise all the children are valid and the root of
the tree is pointed out as buggy, finishing the debugging.
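
The top-down descent just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Node class, the function top_down and the oracle is_valid (which stands in for the user's answers) are hypothetical names, not part of DDT.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    fact: str                                   # basic fact labelling the node
    children: List["Node"] = field(default_factory=list)


def top_down(root: Node, is_valid: Callable[[str], bool]) -> Node:
    """Return a buggy node, assuming the root is already known to be non-valid."""
    current = root
    while True:
        # Query the children in order until a non-valid one is found.
        bad_child = next(
            (c for c in current.children if not is_valid(c.fact)), None
        )
        if bad_child is None:
            # A non-valid node whose children are all valid is buggy.
            return current
        current = bad_child
```

Each iteration moves to a strictly smaller subtree with a non-valid root, so the loop stops at a node that is non-valid and has only valid children.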

The next display shows the starting point of a debugging session using the
top-down strategy, where the user has marked the first node as non-valid and
the second one as trusted:

Fig. 4. Number of steps and nodes examined with the two strategies



Notice that after each subsequent step the selected subtree has a smaller size
and an invalid root. Hence, as a consequence of property P1, a buggy node is
eventually reached.

The divide-and-query strategy was presented in [18] and has also been included
in the system TkCalypso [20]. As in the top-down strategy, debugging starts
with a computation tree whose root is not valid. The idea is to choose a node
N such that the number of nodes inside and outside of the subtree rooted at N
are the same. Although such a node (called the center of the tree) does not exist
in most cases, the system looks for the node that best approximates this
condition. Then the user is queried about the validity of the basic fact labelling
this node. If the node is non-valid, its subtree will be considered at the next step.
If it is valid, then its subtree is deleted from the tree and debugging continues.
The process ends when the subtree considered has been reduced to a single
non-valid node.
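
As an illustration, the following Python sketch implements the divide-and-query loop on a small in-memory tree. It rests on assumptions: the Node class, the helper names and the oracle is_valid are hypothetical, and the "center" is approximated by minimising the difference between the number of nodes inside and outside the candidate subtree.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterator, List, Tuple


@dataclass(eq=False)          # identity-based equality, so list.remove is exact
class Node:
    fact: str
    children: List["Node"] = field(default_factory=list)


def size(n: Node) -> int:
    return 1 + sum(size(c) for c in n.children)


def descendants(n: Node) -> Iterator[Tuple[Node, Node]]:
    """Yield (parent, node) for every node strictly below n."""
    for c in n.children:
        yield n, c
        yield from descendants(c)


def divide_and_query(root: Node, is_valid: Callable[[str], bool]) -> Node:
    """Return a buggy node; the root is assumed non-valid. Mutates the tree."""
    while root.children:
        total = size(root)
        # inside = size(node), outside = total - size(node);
        # pick the node minimising |inside - outside|.
        parent, node = min(
            descendants(root), key=lambda pn: abs(2 * size(pn[1]) - total)
        )
        if is_valid(node.fact):
            parent.children.remove(node)   # delete the valid subtree
        else:
            root = node                    # continue inside the non-valid subtree
    return root
```

Every iteration shrinks the tree under consideration, and the loop ends when it consists of a single non-valid node.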

It is easy to observe that, as in the top-down strategy, the number of nodes in the
tree T considered is reduced after each step, and that nonvalid(root(T)) holds.
To check that the strategy will find some buggy node we must examine the two
actions that it can perform depending on the validity of the selected node N.
First, if nonvalid(N), substituting the whole tree by subtree(T,N) is safe due
to property P1. If valid(N), we must ensure that deleting subtree(T,N)
will not delete all the buggy nodes. This holds again by property P1, since the
tree must have a buggy node B with a path of non-valid nodes from root(T) to
B. Therefore N cannot be part of this path and B is not in subtree(T,N).

Since these strategies modify the structure of the tree, DDT includes options to
save and load computation trees (see the options of the menu "File"). The files
are stored in XML format. These options can also be used to restore a previous
version of the debugging session if the user realizes after some steps that she or
he made a mistake when providing information about the validity of the nodes,
a situation that often arises.

Figure 4 shows a comparison of the number of steps and the number of nodes
examined (between round brackets) during some debugging sessions with the
two strategies. The first row of the table shows the total number of nodes of
the computation tree considered in each example. Goal G1 corresponds to our
example of Figure 1 taking the first 100 approximations of the golden ratio.
Goal G2 uses a buggy program for computing prime numbers presented in [6].
G3 uses a program with arithmetic in Peano's representation. G4 corresponds to
a program for sorting numbers using a functional-logic programming technique
called "lazy generate-and-test" in [8]. Goal G5 uses a program for the symbolic
derivation of expressions, while G6 corresponds to a program implementing a
queue.

The table shows a clear advantage of the divide-and-query strategy w.r.t. the
top-down strategy. In general, the number of steps in a debugging session with a
CT of size n is O(log n) when using the divide-and-query strategy and O(n) when
using the top-down strategy. However, these are worst-case estimations. The
top-down strategy can behave more efficiently whenever there is a buggy node
close to the root, as is the case for goal G3.

The source code of DDT consists of 2900 lines of Java code. We have used the Java
language for two reasons: firstly, Java provides several libraries for designing
graphical interfaces, and in particular some specific classes for representing trees
graphically, and secondly because the Prolog system on which TOy is based,
SICStus Prolog [19], includes an interface for interacting with Java.

5 Simplification of Computation Trees 

Next we present two techniques incorporated in DDT that can automatically
provide information about the validity of some nodes in the computation tree.

5.1 Entailment 

In [5,6] we presented an entailment relation between basic facts based on the
approximation ordering $\sqsubseteq$ defined in Section 2.1. A basic fact
$f\ \bar{t}_n \to t$ entails another basic fact $f\ \bar{s}_n \to s$ (written as
$f\ \bar{t}_n \to t \succeq f\ \bar{s}_n \to s$) iff there is some total
substitution $\theta \in Subst$ such that
$t_1\theta \sqsubseteq s_1, \ldots, t_n\theta \sqsubseteq s_n$ and
$s \sqsubseteq t\theta$.

Notice that the entailment property is covariant in the arguments but
contravariant in the result. For instance, considering the basic facts of Figure 2, it
can be easily proved that

$fib \to [1,1,2,3,4,5 \mid \bot] \succeq fib \to [1,1,2,3,4 \mid \bot]$
$fibAux\ 1\ 2 \to [3,4,5 \mid \bot] \succeq fibAux\ 1\ 2 \to [3,4 \mid \bot]$

with $\theta$ as the identity substitution in both cases. As shown in [6], entailment
between basic facts is a decidable relation, and intended program models are
closed under entailment, i.e. if $f\ \bar{t}_n \to t \succeq f\ \bar{s}_n \to s$ and
$(f\ \bar{t}_n \to t) \in \mathcal{I}$ then $(f\ \bar{s}_n \to s) \in \mathcal{I}$.
This means that if $f\ \bar{t}_n \to t$ is valid then $f\ \bar{s}_n \to s$ will also be
valid, and conversely that if $f\ \bar{s}_n \to s$ is not valid then
$f\ \bar{t}_n \to t$ is not valid.
DDT uses this property for automatically changing the state of some nodes
when the user provides information about others. For instance, when the node
containing $fib \to [1,1,2,3,4,5 \mid \bot]$ is marked as valid, the state of the node
containing $fib \to [1,1,2,3,4 \mid \bot]$ will be changed accordingly to valid, while
marking the node corresponding to $fibAux\ 1\ 2 \to [3,4 \mid \bot]$ as nonvalid will
automatically change to nonvalid the state of the node containing
$fibAux\ 1\ 2 \to [3,4,5 \mid \bot]$.

Every basic fact obviously entails itself. Therefore, any user-given change in the 
state of a node propagates automatically to all the other nodes containing the 
same basic fact. 
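
The propagation of states through entailment can be sketched as follows. This is a simplified illustration, not DDT's algorithm: it only checks entailment under the identity substitution (which suffices for the two examples above), represents partial lists as nested tuples with a BOTTOM marker for the undefined value, and uses hypothetical names throughout.

```python
BOTTOM = object()   # stands for the undefined value (bottom)


def approx(a, b):
    """Approximation ordering: a is below b if b refines the BOTTOMs of a."""
    if a is BOTTOM:
        return True
    if isinstance(a, tuple) and isinstance(b, tuple):
        return len(a) == len(b) and all(approx(x, y) for x, y in zip(a, b))
    return not isinstance(a, tuple) and not isinstance(b, tuple) and a == b


def plist(items, tail=BOTTOM):
    """Build the partial list [items | tail] as nested ('cons', head, tail)."""
    for x in reversed(items):
        tail = ("cons", x, tail)
    return tail


def entails(fact1, fact2):
    """Facts are (function name, argument tuple, result); identity substitution only."""
    f, ts, t = fact1
    g, ss, s = fact2
    return (
        f == g
        and len(ts) == len(ss)
        and all(approx(ti, si) for ti, si in zip(ts, ss))  # covariant arguments
        and approx(s, t)                                    # contravariant result
    )


def propagate(states, facts, node, new_state):
    """Set the state of node and of every node whose state follows by entailment."""
    states[node] = new_state
    for other, fact in facts.items():
        if new_state == "valid" and entails(facts[node], fact):
            states[other] = "valid"
        elif new_state == "nonvalid" and entails(fact, facts[node]):
            states[other] = "nonvalid"
```

Because every fact entails itself, marking one node also updates all other nodes that carry the same basic fact.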
5.2 Trusted Specifications

As explained in previous sections, DDT users can mark some CT nodes as
trusted during a debugging session. All the nodes whose basic facts correspond
to the function used at the trusted node are automatically considered as valid.
A more automatic way of declaring trusted functions is by means of trusted
specifications. This idea is not new and was introduced in the seminal work of
E.Y. Shapiro [18].

We say that a TOy program S is a trusted specification if the user assumes as
valid all the basic facts that can be derived from S. Let P be a buggy program,
T a CT corresponding to some initial symptom for P, and $S_P$ some trusted
specification for some of the functions occurring in P. Then the following procedure
can be used to provide some information about the validity of the nodes in T:

For each basic fact $\varphi : f\ \bar{t}_n \to t$ labelling some node N of T:

    If $valid?(S_P, \varphi)$ = yes then delete N.

    If $valid?(S_P, \varphi)$ = no then mark N as non-valid.

    If $valid?(S_P, \varphi)$ = don't-know then mark N as unknown.

DDT incorporates an algorithm for computing $valid?(S_P, \varphi)$ in a correct
way. If the result is yes then $\varphi$ can be derived from $S_P$ and deleting N is
safe because of property P2. If the result is no then $\varphi$ cannot be derived
from $S_P$ and marking N as non-valid is correct. Otherwise N is marked as unknown.
There are two possible situations where the algorithm returns don't-know:

- If the function used in $\varphi$ is not defined in $S_P$.

- If the time required for deciding if $\varphi$ can be derived from $S_P$ exceeds a
certain time-out constant. This is done to avoid possible problems of
non-termination, since the set of basic facts derivable from a given program is
undecidable in general.

In each debugging session DDT asks the user whether this simplification should
be performed. If the answer is affirmative, the tool asks for the name of the
TOy program which contains the trusted specification, and simplifies the tree
before the navigation phase.
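
The simplification procedure can be pictured with the following Python sketch. It is a strong simplification made for illustration only: facts are ground triples, the trusted specification is a dictionary of Python functions, the time-out case is not modelled, and all names (TRUSTED, valid_query, simplify, fib_n) are hypothetical.

```python
def fib_n(n):
    """Hypothetical trusted definition of a Fibonacci function."""
    return 1 if n <= 2 else fib_n(n - 1) + fib_n(n - 2)


# Trusted specification: function name -> trusted definition (assumption).
TRUSTED = {"fibN": fib_n}


def valid_query(spec, fact):
    """Three-valued validity check of a ground fact (name, args, result)."""
    name, args, result = fact
    if name not in spec:
        return "dont-know"          # function not defined in the specification
    return "yes" if spec[name](*args) == result else "no"


def simplify(nodes, spec):
    """nodes maps node ids to facts; returns (states of kept nodes, deleted ids)."""
    states, deleted = {}, set()
    for node_id, fact in nodes.items():
        answer = valid_query(spec, fact)
        if answer == "yes":
            deleted.add(node_id)            # derivable from the specification
        elif answer == "no":
            states[node_id] = "nonvalid"    # contradicts the specification
        else:
            states[node_id] = "unknown"     # left for the user to decide
    return states, deleted
```

Only the nodes left as unknown still require a question to the user during navigation.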

For instance, the program of Figure 5 is a trusted specification for the example
program of the golden ratio, using a different method for generating the infinite
list with the Fibonacci sequence. To save space we do not include the definitions
of take, tail, goldenApprox, ./. and main, which are the same as in Figure 1.

We can imagine this program as the first, naive, solution for the problem of the 
golden ratio approximations programmed by the user. It works correctly, but 
the generation of Fibonacci numbers is quite inefficient. Then the example of
Figure 1 could be an attempt at improving the efficiency of this program. After
trying the new version the user could observe that it returns a different answer, 
and decide that the first naive version was more likely to be correct. Then the 
declarative debugger could be started using this first version of Figure 5 as a 
trusted specification. The simplification will delete 12 nodes of the computation 






fibN 1 = 1
fibN 2 = 1
fibN N = if N>2 then (fibN (N-1))+(fibN (N-2))

fib = map fibN (from 0)

map F []     = []
map F [X|Xs] = [F X | map F Xs]

from N = [N | from (N+1)]

Fig. 5. A Trusted Specification for the program of Figure 1



tree and mark 3 more nodes as non-valid, hence reducing the number of unknown
nodes in the initial CT from 23 to only 8.

6 Conclusions 

In this paper we have described the declarative debugging tool DDT, which is
part of the functional-logic system TOy. In comparison to the traditional
top-down declarative debuggers, DDT gives more support for avoiding the
complexity of oracle questions. This can be achieved either by skillful free CT
navigation, or by using a divide-and-query navigation strategy. Additionally, DDT
also offers two useful techniques for simplifying the CT prior to navigation.

In contrast to other debugging tools (e.g. the recent visual debugger for
Mercury [3]), DDT is an off-line tool: the computation tree must be completely
generated before it can be displayed and navigated. Unfortunately, complete
CT generation causes a considerable overhead w.r.t. the original computation
which led to the debugging session, both in terms of time and space resources.
Related works on the implementation of declarative debuggers for lazy functional
languages [14,17] have proposed techniques for reducing the computational
overhead caused by debugging. As far as we know, techniques of this kind have been
worked out only for the top-down navigation strategy. They mainly rely on a
lazy generation of the CT as demanded by navigation.

In spite of the computational overhead, we still believe that DDT offers better
facilities for CT simplification and navigation, which means a crucial advantage
in CTs with a large number of nodes, where top-down navigation produces too
many (maybe complex) questions. As future work, we plan to revise the
implementation of the DDT tool, looking for incremental CT simplification and
navigation methods that can be made compatible with lazy CT generation.

Acknowledgements The authors are grateful to Wolfgang Lux and Francisco 

Abstract. This paper presents a self-applicable partial evaluator for
a considerable subset of full Prolog. The partial evaluator is shown to
achieve non-trivial specialisation and be effectively self-applied. Attempts
to self-apply partial evaluators for logic programs have, as yet, not
been all that successful. Compared to earlier attempts, our Lix system
is practically usable in terms of efficiency and can handle natural logic
programming examples with partially static data structures, built-ins,
side-effects, and some higher-order and meta-level features such as call
and findall. The Lix system is derived from the development of the
Logen compiler generator system. It achieves a similar kind of efficiency
and specialisation, but can be used for other applications. Notably, we
show first attempts at using the system for deforestation and tupling
in an offline fashion. We will demonstrate that, contrary to earlier
beliefs, declarativeness and the use of the ground representation are not the
best way to achieve self-applicable partial evaluators.

Keywords: Partial Evaluation, Self-application, Logic Programming, Partial
Deduction, Deforestation, Tupling.



1 Introduction and Summary 

Partial evaluation has received considerable attention over the past decade in
functional (e.g. [16]), imperative (e.g. [2]) and logic programming (e.g. [9, 18,
25]). In the context of pure logic programs, partial evaluation is often referred to
as partial deduction, the term partial evaluation being reserved for the treatment
of impure logic programs. We will adhere to this convention in this paper.

Guided by the Futamura projections (see e.g. [16]) a lot of effort, especially in
the functional partial evaluation community, has been put into making systems
self-applicable. A partial evaluation or deduction system is called self-applicable
if it is able to effectively¹ specialise itself. The practical interests of such a
capability are manifold. The most well-known are related to the second and third
Futamura projections [7]. The first Futamura projection consists of specialising
an interpreter for a particular object program, thereby producing a specialised
version of the interpreter which can be seen as a compiled version of the object
program. If the partial evaluator is self-applicable then one can specialise the
partial evaluator for performing the first Futamura projection, thereby
obtaining a compiler for the interpreter under consideration. This process is called
the second Futamura projection. The third Futamura projection now consists
of specialising the partial evaluator to perform the second Futamura projection.
By this process we obtain a compiler generator (cogen for short).

¹ This implies some efficiency considerations, e.g. the system has to terminate within
reasonable time constraints, using an appropriate amount of memory.
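
The three projections can be summarised in equational form. The notation below is an assumption made for this summary and is not taken from the paper: $\mathit{mix}$ denotes the partial evaluator, $\mathit{int}$ an interpreter, $\mathit{source}$ an object program, and $[\![p]\!]$ the function computed by program $p$.

```latex
\begin{align*}
\mathit{target}   &= [\![\mathit{mix}]\!](\mathit{int}, \mathit{source})
  && \text{(first projection)}\\
\mathit{compiler} &= [\![\mathit{mix}]\!](\mathit{mix}, \mathit{int})
  && \text{(second projection)}\\
\mathit{cogen}    &= [\![\mathit{mix}]\!](\mathit{mix}, \mathit{mix})
  && \text{(third projection)}
\end{align*}
% Expected behaviour of the results:
%   [[target]](input)  = [[int]](source, input)
%   [[compiler]](source) = target
%   [[cogen]](int)       = compiler
```

Only the second and third equations apply the partial evaluator to itself, which is why they require self-applicability.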

History of Self-application for Logic Programming. Not surprisingly, writing
an effectively self-applicable specialiser is a non-trivial task: the more
features one uses in writing the specialiser the more complex the specialisation 
process becomes, as the specialiser then has to handle these features as well. For 
a long time it was believed that in order to develop a self-applicable specialiser 
for logic programs one needed to write a clean, pure and simple specialiser. In 
practice, this meant using few (or even no) impure features in the implemen- 
tation of the specialiser. For this the ground representation [14] was believed 
to be key, in which variables of the source program are represented by ground 
constants within the specialiser. Indeed, the ground representation allows one to 
freely manipulate the source program to be specialised in a declarative manner. 
The non-ground representation, where source-level variables are represented as 
variables in the program specialiser, can suffer from semantical problems [23] and 
requires some non-declarative features (such as findall/3) in order to perform 
the specialisation. 

Some early attempts at self-application [6] used the non-ground
representation, but self-application led to incorrect results as the specialiser did not
properly handle the non-declarative constructs that were employed in its
implementation². Other specialisers like Mixtus [27], Paddy [26] and Ecce [22] use
the non-ground representation, but none of them is able to effectively specialise
itself.

The ground representation approach towards self-application was pursued in
[3], [19], [24], and [4, 12, 13], leading to some self-applicable specialisers:

- SAGE [12], a self-applicable partial evaluator for Gödel. While the speedups
obtained by self-application are respectable, the process takes a very long
time (several hours) and the obtained specialised specialisers are still
extremely slow. This is probably due to the explicit unification algorithm
required by the ground representation. To specialise this effectively, much more
powerful specialisation techniques would be required to obtain reasonably
efficient specialisers. Similar performance problems were encountered in the
earlier work [3].

- Logimix [16, 24], a self-applicable partial evaluator for a subset of Prolog,
including if-then-else, side-effects and some built-ins. Logimix uses a
meta-interpreter (sometimes called InstanceDemo) for the ground representation
in which the goals are "lifted" to the non-ground representation for
resolution. This avoids the use of an explicit unification algorithm, at the expense
of some power³. Unfortunately, Logimix gives only modest speedups (when
compared to results for functional programming languages, see [24]), but
it was probably the first practical self-applicable specialiser for a logic
programming language.

² A problem mentioned in [3], see also [24, 19].

Given the problem in developing a truly practical self-applicable specialiser
for logic programs, the attention shifted to the cogen approach [15]: instead of
trying to write a partial evaluation system which is neither too inefficient nor too
difficult to self-apply, one simply writes a compiler generator directly. Indeed,
the actual creation of the cogen according to the third Futamura projection is
in general not of much interest to users since the cogen can be generated once
and for all when a specialiser is given. This approach was pursued in [17, 21]
leading to the Logen system, which can produce specialised specialisers much
more efficiently than any of the self-applicable systems mentioned above. The
resulting specialisers themselves are also much more efficient.

A New Attempt at Self-application. In a sense the cogen approach has closed
the practical debate on self-application for logic programming languages: one
can get most of the benefits of self-application without writing a self-applicable
specialiser. Still, there is the question of academic curiosity: is it really impossible
to derive the cogen written by hand in [17, 21] by self-application? Also, having a
self-applicable specialiser is sometimes more flexible as we may generate different
cogens for different purposes (such as one with debugging enabled). One may
produce more or less optimised cogens by tweaking the specialisation process,
and better control the tradeoff between specialisation time and quality of the
optimised code. Maybe there are other situations where a self-applicable partial
evaluation system is preferable to a cogen: Glück's specialiser projections [10]
and the semantic modifiers of Abramov and Glück [1] may be such a setting.

This paper aims to answer some of these questions. Indeed, after the
development of Logen we realised that one could translate Logen into a classical
partial evaluator without too much difficulty. Furthermore, using new annotation
facilities developed for the second version of Logen [21], one can actually make
this partial evaluator (henceforth called Lix) self-applicable. By self-applying
Lix we obtain generating extensions via the second Futamura projection which
are very similar to the ones produced by Logen, and the cogen obtained via the
third Futamura projection also has a lot of similarities to the code of Logen. The
performance of this self-applicable partial evaluator is (after self-application) on
par with Logen, and is thus much faster than any of the previous self-applicable
logic programming specialisers. In the paper we also show some potential
practical applications of this self-applicable specialiser.

The code of the specialiser itself is also surprisingly simple, but uses a few
non-declarative features and does not use the ground representation. So, contrary
to earlier belief, declarativeness and the ground representation were not the best
way to climb the mountain of self-application. Indeed, the use of the non-ground
representation makes our partial evaluator much more efficient and avoids all
the complications related to specialising an explicit unification algorithm. The
only drawback is that to safely deal with the non-ground representation, our
partial evaluator needs to use some non-declarative features such as findall,
and hence also has to be able to specialise them. Fortunately, this turned out to
be less of a problem than anticipated.

³ This idea was first used by Gallagher in [8, 9] and then later in [20] to write a
declarative meta-interpreter for integrity checking in databases.

In summary, Futamura’s insight was that a cogen could be derived by a 
self-applicable specialiser. The insight in [15] was that a cogen is just a simple 
extension of a binding-time analysis, while our insight is that an effective self- 
applicable specialiser can be derived by transforming a cogen. 

2 The Partial Evaluator 

Logen and Lix are both offline partial evaluators. An offline partial evaluator
works on an annotated version of the source program; these annotations are used
to guide the specialisation process. There are two kinds of annotations:

- filter declarations, indicating whether arguments to predicates are static 
or dynamic. This influences the global control. 

- clause annotations, indicating how every call in the body should be treated 
during unfolding. These influence the local control. 

2.1 The Basic Annotations 

A common annotation format is used for both the Lix and Logen systems. Each
call in the program is annotated using logen/2 and arguments are annotated
using filter declarations. The head of a clause is annotated with an identifier.
The format of the annotations is demonstrated in the following append example:

:- filter append(static, dynamic, dynamic).
logen(app, append([],L,L)).
logen(app, append([H|T],L,[H|T1])) :- logen(unfold, append(T,L,T1)).

The first argument to append has been marked as static: it will be known
at specialisation time. The other arguments have been marked dynamic.
The recursive call to append is annotated for unfolding; the first argument is
known, thus guaranteeing termination at specialisation time. Some of the basic
annotations are:

- unfold: for reducible predicates; they will be unravelled during specialisation.

- memo: for non-reducible predicates; they will be added to the memoisation
table and replaced with a generalised residual predicate.

- call: the call is fully static and will be made during specialisation.

- rescall: the call will be kept and will appear in the final specialised code.

2.2 The Source Code 

We now present the main body of the Lix partial evaluator⁴. An atom A is
specialised by calling lix(A,Res). The memo/2 and memo_table/2 predicates
return in their second argument a call to a new specialised predicate where static
arguments have been removed and dynamic ones generalised. Generalisation and
filtering are performed by generalise_and_filter/3. It returns in its second
argument the generalised call (to be unfolded) and in its third argument the
call to the specialised predicate. It uses the annotations defined by filter/2
to perform its task. The predicate gensym/2 is used to create unique names
for the specialised predicates. The predicate unfold/2 computes the bodies of
specialised predicates. A call annotated as memo is replaced by a call to the
specialised version. If it does not already exist it is created by memo/2. A call
annotated as unfold is further unfolded; a call annotated as call is completely
evaluated; finally, a call annotated as rescall is added to the residual code
without modification (for built-ins that cannot be evaluated or code that is
defined elsewhere). All clauses defining the new predicate are collected using
findall/3 and pretty printed.

⁴ The Lix system can be downloaded from:
http://www.ecs.soton.ac.uk/~sjc02r/lix/lix.html.

Note the use of the global side effect, assert(memo_table(GCall, RCall)),
to maintain the list of previously specialised calls. The univ operator =.. can be
used either to decompose a term into a list containing its functor and arguments
or else to construct a term from such a list. For example, the term f(X,Y) can be
deconstructed into [f,X,Y].

To save space the definition of pretty_print_clauses/1 is not given.

:- dynamic memo_table/2, flag/2.

lix(CallToSpecialise, ResidualCall) :-
    print(':- dynamic flag/2, memo_table/2.\n'),
    print(':- use_module(library(lists)).\n'),
    memo(CallToSpecialise, ResidualCall).

memo(Call, Residual) :-
    ( memo_table(Call, Residual) -> true
    ; generalise_and_filter(Call, GenCall, ResidualPred),
      assert(memo_table(GenCall, ResidualPred)),
      findall((ResidualPred:-Body), unfold(GenCall, Body), Clauses),
      format('/*~k=~k*/~n', [ResidualPred, GenCall]),
      pretty_print_clauses(Clauses), memo_table(Call, Residual)
    ).

unfold(Head, Residual) :- ann_clause(_, Head, Body), pe(Body, Residual).

pe(true, true).
pe((A,B), (ResA,ResB)) :- pe(A, ResA), pe(B, ResB).
pe(logen(call,Call), true) :- call(Call).
pe(logen(rescall,Call), Call).
pe(logen(memo,Call), Residual) :- memo(Call, Residual).
pe(logen(unfold,Call), Residual) :- unfold(Call, Residual).

generalise_and_filter(Call, GenCall, ResidualPred) :-
    filter(Call, Filter), Call =.. [Head|Args],
    gen_filter(Filter, Args, GenArgs, ResArgs),
    GenCall =.. [Head|GenArgs],
    gensym(Head, ResHead), ResidualPred =.. [ResHead|ResArgs].

gen_filter([], [], [], []).
gen_filter([static|A], [B|C], [B|D], E) :- gen_filter(A, C, D, E).
gen_filter([dynamic|A], [_|B], [C|D], [C|E]) :- gen_filter(A, B, D, E).

/* code for unique symbol generation, using dynamic flag/2 */
oldvalue(Sym, Value) :- flag(gensym(Sym), Value), !.
oldvalue(_, 0).

set_flag(Sym, Value) :-
    nonvar(Sym), retract(flag(Sym,_)), !, asserta(flag(Sym, Value)).
set_flag(Sym, Value) :- nonvar(Sym), asserta(flag(Sym, Value)).

gensym(Head, ResidualHead) :-
    var(ResidualHead), atom(Head), oldvalue(Head, OldVal),
    NewVal is OldVal+1, set_flag(gensym(Head), NewVal),
    name(A__, "__"), string_concat(Head, A__, Head__),
    string_concat(Head__, NewVal, ResidualHead).

append([], A, A).
append([A|B], C, [A|D]) :- append(B, C, D).

string_concat(A, B, C) :- name(A, D), name(B, E),
    append(D, E, F), name(C, F).

/* Clause Database: automatically created from annotated file */
ann_clause(1, app([],A,A), true).
ann_clause(2, app([A|B],C,[A|D]), logen(memo, app(B,C,D))).
filter(app(_,_,_), [dynamic, static, dynamic]).
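
The generalisation and filtering step of the listing above can be mimicked outside Prolog. The following Python sketch mirrors gen_filter/4 and generalise_and_filter/3 for the static and dynamic binding types only. It is an illustration under assumptions: calls are modelled as tuples (functor, arg1, ..., argN), logic variables as Var objects, and all Python names are hypothetical.

```python
import itertools


class Var:
    """A fresh logic variable; only its identity matters in this sketch."""
    _ids = itertools.count()

    def __init__(self):
        self.name = f"_G{next(Var._ids)}"

    def __repr__(self):
        return self.name


def gen_filter(filters, args):
    """Return (generalised arguments, residual arguments)."""
    gen_args, res_args = [], []
    for binding, arg in zip(filters, args):
        if binding == "static":
            gen_args.append(arg)      # kept in the generalised call, filtered out
        elif binding == "dynamic":
            v = Var()                 # generalised to a fresh variable ...
            gen_args.append(v)
            res_args.append(v)        # ... which is an argument of the residual call
        else:
            raise ValueError(f"unknown binding type: {binding}")
    return gen_args, res_args


_counter = {}


def gensym(head):
    """Unique residual predicate names: app__1, app__2, ..."""
    _counter[head] = _counter.get(head, 0) + 1
    return f"{head}__{_counter[head]}"


def generalise_and_filter(call, filters):
    head, *args = call                # analogue of Call =.. [Head|Args]
    gen_args, res_args = gen_filter(filters, args)
    return (head, *gen_args), (gensym(head), *res_args)
```

For the call app(A,[b],C) with filter [dynamic, static, dynamic], the generalised call keeps [b] while the residual call app__1 carries only the two fresh variables.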



2.3 Deriving Lix from Logen

The Lix partial evaluator was created by transforming the Logen compiler
generator. The basic insight was that it is possible to create a classical partial
evaluator that when specialised would produce similar generating extensions.
Let us compare a small extract of code from both Logen and Lix, dealing with
the call and rescall annotations:

Logen:

body(logen(call,Call), Call, true).
body(logen(rescall,Call), true, Call).

Lix:

pe(logen(call,Call), true) :- call(Call).
pe(logen(rescall,Call), Call) :- true.

The body predicate is explained in detail in [21]. Basically, the first argument
is an annotated call, the second argument is the code that will appear in the
generating extension and the third argument denotes the specialised code. We can
see that the middle argument from body/3 in Logen has been transformed into
a call in the Lix version. This call is annotated as residual for self-application,
and will hence appear in the generating extension produced by self-application.
A more detailed comparison of the generating extensions and the produced cogen
can be found in Section 5.



2.4 Specialised Code 

To specialise code we use the lix/2 entry point. Calling lix(app(A,[b],C), Res)
specialises the append predicate to append [b] to the end of a list:

app__1([], [b]).
app__1([A|B], [A|C]) :- app__1(B, C).

The generation of the above code took 0.318 ms⁵. This is a very simple example
to demonstrate the partial evaluator. The specialisation of a non-trivial Vanilla
debugging interpreter and other examples can be found on the Lix homepage⁶.

3 Towards Self-application 

We have presented the main body of the code for the Lix system. For a partial
evaluator to be self-applicable it must be able to effectively handle all of the
features it uses. The system we have presented so far uses a few non-declarative
features and does not use the ground representation. In this section we shall
introduce the extensions required to make Lix self-applicable.

3.1 The Nonvar Binding-Type

We now present a new feature derived from logen which is useful when special- 
ising interpreters. This annotation will be the key for effective self-application. 

In addition to marking arguments to predicates as static or dynamic, it is
also possible to use the binding-type nonvar. This means that this argument is
not a free variable and will have at least a top-level function symbol, but it is not
necessarily ground. For example f(X), f(a) and f are all nonvar but the variable
X is not. During generalisation, the top-level function symbol is kept but all its
sub-arguments are replaced by fresh variables. For filtering, every sub-argument
becomes a new argument of the residual predicate.

A small example will help to illustrate this annotation: 

:- filter p(nonvar).

p(f(X)) :- p(g(a)).     p(g(X)) :- p(h(X)).
p(h(a)).                p(h(X)) :- p(f(X)).

If we mark no calls as unfoldable, we get the following specialised program
for the call p(f(Z)):

%%% entry point: p(f(Z)) :- p__0(Z)

p__0(B) :- p__1(a).     p__1(B) :- p__2(B).
p__2(a).                p__2(B) :- p__0(B).

If we mark everything except the last call as unfoldable we obtain:

p__0(B).
p__0(B) :- p__0(a).

The gen_filter/2 predicate in the lix source code is extended to handle
the nonvar annotation:

gen_filter([nonvar|A], [B|C], [D|E], F) :-
    B=..[G|H], length(H, I), length(J, I),
    D=..[G|J], gen_filter(A, C, E, K), append(J, K, F).
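The two operations described for the nonvar binding-type can be sketched in isolation. The following is only an illustration, not part of the lix source; the predicate names generalise_nonvar/2 and filter_nonvar/2 are hypothetical.

```prolog
% generalise_nonvar(+Term, -Skeleton)
% Keep the top-level function symbol of Term and replace all of its
% sub-arguments by fresh variables (hypothetical helper, not from lix).
generalise_nonvar(Term, Skeleton) :-
    nonvar(Term),
    functor(Term, Name, Arity),
    functor(Skeleton, Name, Arity).

% filter_nonvar(+Term, -Args)
% Every sub-argument of Term becomes an argument of the residual predicate.
filter_nonvar(Term, Args) :-
    nonvar(Term),
    Term =.. [_|Args].
```

For the call p(f(Z)) from the example above, filter_nonvar(f(Z), Args) yields Args = [Z], which corresponds to the single argument of the residual predicate p__0/1.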

® Benchmarks performed using SICStus Prolog 3.10.1 for Linux on a Pentium 2.4GHz 
with 512MB RAM. 

® http://www.ecs.soton.ac.uk/~sjc02r/lix/lix.html
3.2 Treatment of findall

In Lix findall is used to collect the clauses when unfolding a call; hence we 
have to be able to treat this feature during specialisation. 

Handling findall is actually not much different from handling negation in
[21]. There is a static version (findall), in which the call is executed at
specialisation time, and a dynamic version (resfindall), where it is executed at
runtime. In both cases, the second argument must be annotated. For resfindall,
much like resnot in [21], the annotated argument should be deterministic and
should not fail (which can be ensured by wrapping the argument into a hide_nf
annotation, see [21]). Also, if a findall is marked as static then the call should
be sufficiently instantiated to fully determine the list of solutions. The following
code is used in the subsequent examples:

:- filter all_p(static, dynamic).
all_p(X,Y) :- findall(X,p(X),Y).

:- filter p(static).
p(a). p(b).

If the findall is marked as residual and we memo p(X) inside it then the
specialised program for all_p(a,Y) is:

all_p__0(A) :- findall(a,p__1,A).

p__1.

If we mark p(X) as unfold we get:

all_p__0(A) :- findall(a,true,A).

For self-application, only resfindall is actually required. The pe/2 predicate
is extended as follows:

pe(resfindall(Vars,G2,Sols), findall(Vars,VS2,Sols)) :-
    pe(G2,VS2).

3.3 Treatment of if 

In the LIX code an if-then-else is used in memo/2. In this case the if is
dynamic: the condition is specialised, along with both of the branches, and
an if statement is constructed in the residual code. Lix is
also extended to handle a static if, which is performed at specialisation time.

pe(resif(A,B,C), (D->E;F)) :- pe(A, D), pe(B, E), pe(C, F).
pe(if(A,B,C), D) :- (pe(A, _) -> pe(B, D) ; pe(C, D)).
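To see the difference between the two annotations, the two clauses can be run inside a miniature specialiser. This is a sketch under stated assumptions: only the resif and if clauses come from the text above; the clauses for true, logen(call,_) and logen(rescall,_) are a simplified, hypothetical stand-in for the rest of pe/2.

```prolog
% Miniature pe(+AnnotatedGoal, -ResidualCode); simplified stand-in for lix's pe/2.
pe(true, true).
pe(logen(call, G), true) :- call(G).     % static built-in: executed now
pe(logen(rescall, G), G).                % dynamic built-in: kept in residual code

% Dynamic if: specialise condition and both branches, rebuild the conditional.
pe(resif(A,B,C), (D->E;F)) :- pe(A, D), pe(B, E), pe(C, F).

% Static if: decide at specialisation time, keep only the chosen branch.
pe(if(A,B,C), D) :- ( pe(A, _) -> pe(B, D) ; pe(C, D) ).
```

With these clauses, specialising if(logen(call, 1 < 2), logen(rescall, write(yes)), logen(rescall, write(no))) leaves only write(yes) as residual code, whereas the same goal annotated with resif and a rescall condition leaves a complete (Cond -> Then ; Else) in the residual program.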
3.4 Handling the Cut

This is actually very easy to do, as with careful annotation the cut can be 
treated as a normal built-in call. The cut must be annotated using call, where 
it is performed at specialisation time, or rescall, where it is included in the 
residual code. It is up to the annotator to ensure that this is sound, i.e. Lix 
assumes that: 

- if a cut marked call is reached during specialisation then the calls to the left 
of the cut will never fail at runtime. 

- if a cut is marked as rescall within a predicate p, then no calls to p are 
unfolded. 

These conditions are sufficient to handle the cut in a sound, but still useful 
manner. 

4 Self-application 

Using the features introduced in Section 3 and the basic annotations from Sec- 
tion 2.1, LIX can be successfully annotated for self-application. Self-application 
allows us to achieve the Futamura projections mentioned in the introduction. 

4.1 Generating Extensions 

In Section 2.4 we specialised app/3 for the call app(A,[b],C). If a partial
evaluator is fully self-applicable then it can specialise itself for performing a
particular specialisation, producing a generating extension. This process is the
second Futamura projection. When specialising an interpreter the generating
extension is a compiler.

A generating extension for the append predicate can be created by calling
lix(lix(app(A,B,C),R),R1), creating a specialised specialiser for append.

/*Generated by Lix*/

:- dynamic flag/2, memo_table/2.

/* oldvalue__1(_5557,_5586) = oldvalue(_5557,_5586) */

oldvalue__1(A, B) :- flag(gensym(A), B), !.
oldvalue__1(_, 0).

/* set_flag__1(_7128,_7153) = set_flag(gensym(_7128),_7153) */

set_flag__1(A, B) :- retract(flag(gensym(A), _)), !,
    asserta(flag(gensym(A), B)).
set_flag__1(A, B) :- asserta(flag(gensym(A), B)).

/* gensym__1(_4392) = gensym(app,_4392) */

gensym__1(A) :- var(A), oldvalue__1(app, B),
    C is B+1, set_flag__1(app, C),
    name(C, D), name(A, [97,112,112,95,95|D]).

/* Printing and Flatten Clauses removed to save space */

/* unfold__1(_6925,_6927,_6929,_6956) =
       unfold(app(_6925,_6927,_6929),_6956) */

unfold__1([], A, A, true).
unfold__1([A|B], C, [A|D], E) :- memo__1(B, C, D, E).

/* memo__1(_2453,_2455,_2457,_2484) =
       memo(app(_2453,_2455,_2457),_2484) */

memo__1(A, B, C, D) :-
    ( memo_table(app(A,B,C), D) -> true
    ; gensym__1(E), F=..[E,G,H],
      assert(memo_table(app(G,B,H), F)),
      findall((F:-I), unfold__1(G,B,H,I), J),
      format('/*~k=~k*/~n', [F, app(G,B,H)]),
      pretty_print_clauses__1(J),
      memo_table(app(A,B,C), D)
    ).

/* lix__1(_1288,_1290,_1292,_1319) = lix(app(_1288,_1290,_1292),_1319) */
lix__1(A, B, C, D) :- memo__1(A, B, C, D).

This is almost entirely equivalent to the proposed specialised unfolders in
[17, 21]. It is actually slightly better, as it performs flow analysis and only
generates unfolders for those predicates that are reachable from the query to be
specialised. Note that the gensym/2 predicate is specialised to produce only
symbols of the form app__N. Generation of the above took 3.3 ms.
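The way memo__1/4 collects residual clauses with findall can be tried out without the memo table and gensym machinery. In the sketch below, memo_stub/4 is a hypothetical replacement for memo__1/4 that simply assumes the pattern app(_,[b],_) has already been memoised under the name app__1/2; residual_clauses/1 is likewise an illustrative name.

```prolog
% Stub standing in for memo__1/4: assumes app(_, [b], _) is already
% memoised as the residual predicate app__1/2 (hypothetical simplification).
memo_stub(B, [b], D, app__1(B, D)).

% Specialised unfolder for app/3, as in the generating extension above,
% with memo__1/4 replaced by the stub.
unfold__1([], A, A, true).
unfold__1([A|B], C, [A|D], E) :- memo_stub(B, C, D, E).

% Collect one residual clause per solution of the unfolder,
% mirroring the findall call inside memo__1/4.
residual_clauses(Clauses) :-
    findall((app__1(G, H) :- Body), unfold__1(G, [b], H, Body), Clauses).
```

Calling residual_clauses(Cs) returns two clauses, one per clause of app/3, with the static list [b] already propagated into the base case.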

The generating extension for append can be used to specialise the append
predicate for different sets of static data. Calling the generating extension with
lix__1(A,[b],C,R) creates the same specialised version of the append predicate
as in Section 2.4:

app__1([], [b]).

app__1([A|B], [A|C]) :- app__1(B, C).

However, using the generating extension is faster: for this small example it
takes 0.212 ms instead of 0.318 ms. Using a larger benchmark, unfolding (as
opposed to memoising) the append predicate for a 10,000 item list produces more
dramatic results. To generate the same code the generating extension takes 40 ms
compared to 990 ms for lix. The overhead of creating the generating extension for
the larger benchmark is only 10 ms. Generating extensions can be very efficient
when a program is to be specialised multiple times with different static data.

4.2 Lix Compiler Generator 

The third Futamura projection is realised by specialising the partial evaluator
to perform the second Futamura projection. By this process we obtain a
compiler generator (cogen for short), a program that transforms interpreters
into compilers. By specialising lix to create generating extensions we create
LIX-COGEN, a self-applied compiler generator. This can be achieved with the
query lix(lix(lix(Call,R),R1),R2). An extract from the produced code is
now given:
/* unfold__13(Annotation, Generated Code, Specialisation Time) */

unfold__13(true, true, true).

unfold__13((A,B), (C,D), (E,F)) :-
    unfold__13(A, C, E),
    unfold__13(B, D, F).

unfold__13(logen(call,A), true, call(A)).

unfold__13(logen(rescall,A), A, true).

This has basically re-generated the 3-level cogen described in [17, 21]. In the
rescall annotation for example, the call A will become part of the residual
program, and nothing (true) is performed at specialisation time.
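A small query makes the three-level split concrete. The sketch below simply repeats the four clauses of the extract so that it is self-contained; the example body in the comment is an assumption chosen for illustration.

```prolog
% unfold__13(Annotation, GeneratedCode, SpecialisationTimeCode),
% repeated from the extract above so that the example is self-contained.
unfold__13(true, true, true).
unfold__13((A,B), (C,D), (E,F)) :-
    unfold__13(A, C, E),
    unfold__13(B, D, F).
unfold__13(logen(call,A), true, call(A)).
unfold__13(logen(rescall,A), A, true).

% Example query (illustrative body: one static call, one residual call):
% ?- unfold__13((logen(call, X is 1+2), logen(rescall, write(X))), Gen, Spec).
% Gen  = (true, write(X)),
% Spec = (call(X is 1+2), true).
```

The static arithmetic ends up in the specialisation-time component and the write/1 call in the generated code; nothing is executed by unfold__13/3 itself.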

This code extract demonstrates the importance of the nonvar annotation.
The annotated version of the original unfold/2 is now shown.

:- filter unfold(nonvar, dynamic):unfold.
logen(unfold, unfold(X,Code)) :-
    logen(unfold, ann_clause(_,X,B)),
    logen(unfold, pe(B,Code)).

Without the nonvar annotation the first argument would be annotated dynamic,
as the arguments to the call being unfolded may not be known at specialisation
time. This would produce a single generic unfolder predicate much like
the original lix. The nonvar annotation is needed to generate the specialised
unfolders.

The generated lix-COGEN will transform an annotated program directly into
a generating extension, like the one found in Section 4.1. However lix-COGEN is
faster: to create the same generating extension from an input program of 1,000
predicates lix-COGEN takes only 3.9 s compared to 100.9 s for lix.

5 Comparison 

Logen The logen system is an offline partial evaluation system using the
cogen approach. Instead of using self-application to achieve the third Futamura
projection, the LOGEN compiler generator is hand written. Lix was derived from
LOGEN by rewriting it into a classical partial evaluation system. Using the second
Futamura projection and self-applying lix produces almost identical generating
extensions to those produced by LOGEN (and both systems can in principle treat
full Prolog). Apart from the predicate names the specialised unfolders generated
by the two systems are the same:



app__u([],A,A,true).                unfold__1([],A,A,true).

app__u([A|B],C,[A|D],E) :-          unfold__1([A|B],C,[A|D],E) :-
    app__m(B,C,D,E).                    memo__1(B,C,D,E).

Logen Generating Extension          Lix-COGEN Generating Extension



While LOGEN is a hand written compiler generator, lix must be self-applied 
to produce the same result as in Section 4.2. If we compare the LOGEN source 
code to the output in Section 4.2 we find very similar clauses in the form of 
body/3 (note however, that the order of the last two arguments is reversed). 
body(true,true,true).                   unfold__13(true,true,true).
body((G,GS),(G1,GS1),(V,VS)) :-         unfold__13((A,B),(C,D),(E,F)) :-
    body(G,G1,V),                           unfold__13(A,C,E),
    body(GS,GS1,VS).                        unfold__13(B,D,F).
body(logen(call,Call),Call,true).       unfold__13(logen(call,A),true,call(A)).
body(logen(rescall,Call),true,Call).    unfold__13(logen(rescall,A),A,true).

Logen                                   Lix-cogen



Unlike Lix, logen does not perform flow analysis. It produces unfolders for all 
predicates in the program, regardless of whether or not they are reachable. 

Logimix and Sage Comparisons of the initial cogen with other systems such
as LOGIMIX, PADDY, and SP can be found in [17]. In essence, LOGEN was 50
times faster than logimix at producing the generating extensions (0.02 s instead
of 1.10 s or 0.02 s instead of 0.98 s) and the specialisation times were about
2 times faster. It is likely that a similar relationship holds between lix and
logimix given that lix and LOGEN have similar performance. Unfortunately
logimix no longer runs on current versions of SICStus Prolog and we were thus
unable to compare lix and logimix directly. Similarly, Gödel no longer runs
on current versions of SICStus Prolog, and hence we could not produce any
timings for SAGE. However, timings from [12] indicate that the use of the ground
representation means that SAGE is far too slow to be practical. Indeed, generating
the compiler generator took about 100 hours and creating a generating extension
for the examples in [12] took at least 7.9 hours. The speedups from using the
generating extension instead of the partial evaluator range from 2.7 to 3.6 but
the execution times for the generating extensions still ranged from 113 s to 447 s.

Multi-level Languages Our annotation scheme (for both lix and logen) can 
be viewed as a two-level language. Contrary to MetaML [28] our annotations 
are not part of the programming language itself (as we treat classical Prolog). It
would be interesting to investigate to what extent one could extend our scheme 
for multiple levels of specialisation [11]. 

6 New Applications 

Apart from the academic satisfaction of building a self-applicable specialiser, we
think that there will be practical applications as well. We elaborate on a few in 
this section. 

Several Versions of the Cogen In the development of new annotation and 
specialisation techniques it is often useful to have a debugging specialisation 
environment without incurring any additional overhead when it is not required.
Using LIX we can produce a debugging or non-debugging specialiser from the 
same base code, the overhead of debugging being specialised away when it is not 
required. By augmenting Lix with extra options we can produce several versions 
of the cogen depending on the requirements: 
- a debugging cogen, useful if the specialisation does not work as expected

- a profiling cogen 

- a simple cogen, whose generating extensions produce no code but which can 
be fed into termination analysers or abstract interpreters to obtain informa- 
tion to check the annotations. 

We could also play with the annotations of Lix to produce more or less 
aggressive specialisers, depending on the desired tradeoff between specialisation 
time, size of the specialised code and the generating extensions, and quality of the 
specialised code. This would be more flexible and maintainable than re-writing 
LOGEN to accommodate various tradeoffs.

Extensions for Deforestation/Tupling Lix is more flexible than LOGEN: 
we do not have to know beforehand which predicates are susceptible to being 
unfolded or memoised. Hence, lix can handle a potentially unbounded number 
of predicates. Using this allows lix to perform a simple form of conjunctive 
partial deduction [5]. 

For example, the following is the well known double append example where 
conjunctive partial deduction can remove the unnecessary intermediate datas- 
tructure XY (this is deforestation): 

doubleapp(X,Y,Z,XYZ) :- append(X,Y,XY), append(XY,Z,XYZ).

append([],L,L).

append([H|X],Y,[H|Z]) :- append(X,Y,Z).

When annotating this example for lix we can now simply annotate a con- 
junction as memo (which is not allowed in logen): 

ann_clause(1, doubleapp(A,B,C,D), (memo((append(A,B,E), append(E,C,D))))).

Running LIX on this will produce a result where the intermediate datastruc- 
ture has been removed (after post-processing, as in [5]): 

doubleapp(A,B,C,D) :- doubleapp__0(A,B,C,D).

append__2([],B,B).

append__2([C|D],E,[C|F]) :- append__2(D,E,F).

conj__1([],[],B,B).

conj__1([],[C|D],E,[C|F]) :- append__2(D,E,F).

conj__1([G|H],I,J,[G|K]) :- conj__1(H,I,J,K).

doubleapp__0(B,C,D,E) :- conj__1(B,C,D,E).

For this example to work in LOGEN we would need to declare every possible
conjunction skeleton beforehand, as a specialised unfolder predicate has to be
generated for every such conjunction. Lix is more flexible in that respect, as it
can unfold a conjunction even if it has not been declared before.
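One way to gain confidence in the deforested program is to run it next to the original. The sketch below places both versions in one file; the names app/3, doubleapp_orig/4 and same_result/3 are hypothetical (app/3 is used instead of append/3 only to avoid clashing with a library predicate), while the specialised clauses are those shown above.

```prolog
% Original program, with append/3 renamed to app/3 (renaming is an
% assumption of this sketch, to avoid a clash with library append/3).
app([], L, L).
app([H|X], Y, [H|Z]) :- app(X, Y, Z).
doubleapp_orig(X, Y, Z, XYZ) :- app(X, Y, XY), app(XY, Z, XYZ).

% Specialised program: no intermediate list XY is built.
append__2([], B, B).
append__2([C|D], E, [C|F]) :- append__2(D, E, F).
conj__1([], [], B, B).
conj__1([], [C|D], E, [C|F]) :- append__2(D, E, F).
conj__1([G|H], I, J, [G|K]) :- conj__1(H, I, J, K).
doubleapp__0(B, C, D, E) :- conj__1(B, C, D, E).

% same_result(+X, +Y, +Z): both versions compute the same concatenation.
same_result(X, Y, Z) :-
    doubleapp_orig(X, Y, Z, R),
    doubleapp__0(X, Y, Z, R).
```

For instance, same_result([1,2],[3],[4,5]) succeeds, with both versions producing [1,2,3,4,5]. This is only a spot check on ground inputs, not a proof of equivalence.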

We have also managed to deal with the rotate-prune example from [5], but 
more research will be needed into the extent that the extra flexibility of lix 
can be used to do deforestation or tupling in practice. It should be possible, for 
example, to find out whether there is a bounded number of conjunction skeletons 
simply by self-application. 
7 Conclusions and Future Work

We have presented an implemented, effective and surprisingly simple, self-ap- 
plicable partial evaluation system for Prolog and have demonstrated that the 
ground representation is not required for a partial evaluation system to be self- 
applicable. The Lix system can be used for the specialisation of non-trivial in- 
terpreters, and we hope to extend the system to use more sophisticated binding 
types developed for logen. 

While LIX and LOGEN essentially perform the same task, there are some 
situations where a self-applicable partial evaluation system is preferable. Lix
can potentially produce better generating extensions, using specialised versions 
of gensym and performing some of the generalisation and filtering beforehand. 
We have shown the potential for the use of lix in deforestation, and in producing 
multiple cogens from the same code. Tweaking the annotation of lix allows the 
cogen generation to be controlled. The overhead of a debugging cogen can be 
removed or a more aggressive specialiser can be generated. 

At present the annotations for lix and LOGEN are placed by hand. We are still 
working on a fully automatic binding time analysis (bta). The automatic bta will
be used with a graphical interface allowing the user to tweak the annotations. 
Abstract. Non-failure analysis aims at inferring that predicate calls in
a program will never fail. This type of information has many applications
in functional/logic programming. It is essential for determining
lower bounds on the computational cost of calls, useful in the context
of program parallelization, instrumental in partial evaluation and other
program transformations, and has also been used in query optimization.
In this paper, we re-cast the non-failure analysis proposed by Debray
et al. as an abstract interpretation, which not only allows us to investigate
it from a standard and well understood theoretical framework, but also has
several practical advantages. It allows us to incorporate non-failure
analysis into a standard, generic abstract interpretation engine. The
analysis thus benefits from the fixpoint propagation algorithm, which leads
to improved information propagation. Also, the analysis takes advantage
of the multi-variance of the generic engine, so that it is now able to infer
separate non-failure information for different call patterns. Moreover, the
implementation is simpler, and allows us to perform non-failure and covering
analyses alongside other analyses, such as those for modes and types,
in the same framework. Finally, besides the precision improvements and
the additional simplicity, our implementation (in the Ciao/CiaoPP
multiparadigm programming system) also shows better efficiency.



1 Introduction 

Non-failure analysis involves detecting at compile time that, for any call belong- 
ing to a particular (possibly infinite) class of calls, a predicate will never fail. As 
an example, consider a predicate defined by the following two clauses: 
abs(X, Y) :- X >= 0, Y is X.

abs(X, Y) :- X < 0, Y is -X.

and assume that we know that this predicate will always be called with its 
first argument bound to an integer, and the second argument a free variable. 
Obviously, for any particular call, one or the other of the tests X >= 0 and 
X < 0 may fail; however, taken together, one of them will always succeed. Thus, 
we can infer that calls to the predicate will never fail. 
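The covering argument can be illustrated by testing it on a finite range of integers. The sketch below is not a non-failure analysis (it checks concrete calls rather than reasoning statically); my_abs/2 and never_fails_on/2 are hypothetical names, and forall/2 and between/3 are assumed to be available as in SWI-Prolog.

```prolog
% The two clauses from the text, renamed to my_abs/2 to avoid any clash
% with system predicates (renaming is an assumption of this sketch).
my_abs(X, Y) :- X >= 0, Y is X.
my_abs(X, Y) :- X < 0, Y is -X.

% never_fails_on(+Low, +High): every call my_abs(X, _) with X an integer
% in Low..High and the second argument free succeeds.
never_fails_on(Low, High) :-
    forall(between(Low, High, X), my_abs(X, _)).
```

The goal never_fails_on(-5, 5) succeeds because the tests X >= 0 and X < 0 together cover all integers, whereas a call with a bound second argument such as my_abs(3, 4) can fail, which is why the call pattern matters.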

Being able to determine statically that a predicate will not fail has many 
applications. It is essential for determining lower bounds on the computational 
cost of goals since without such information a lower bound of almost zero
(corresponding to an early failure) must often be assumed [10]. Detecting non-failure
is also very useful in the context of parallelism because it allows avoiding unnec- 
essary speculative parallelism and ensuring no-slowdown properties for the par- 
allelized programs (in addition to using the lower bounds mentioned previously 
to perform granularity control) [11]. Non-failure information is also instrumental
in partial evaluation and other program transformations, such as reordering of 
calls, and has also been used in query optimization in deductive databases [8]. It 
is also useful in program debugging, where it allows verifying user assertions re- 
garding non-failure of predicates [12,13]. Finally, similar techniques can be used 
to detect the absence of errors or exceptions when running particular predicates. 

A practical non-failure analysis has been proposed by Debray et al. [9]. In a 
similar way to the example above, this approach relies on first inferring mode 
and type information, and then testing that the constraints in the clauses of the 
predicate are entailed by the types of the input arguments, which is called a 
covering test. Covering cannot be inferred by examining the constraints of each 
clause separately: it is necessary to collect them together and examine the be- 
havior of the predicate as a whole. Furthermore, non-failure of a given predicate 
depends on non-failure of other predicates being called and also possibly on the 
constraints in such predicates. 

While [9] proposed the basic ideas behind non-failure analysis, only a simple, 
monovariant algorithm was proposed for propagating the non-failure informa- 
tion. In our experience since that proposal, we have found a need to improve 
it in several ways. First, information propagation needs to be improved, which 
leads us to a fixpoint propagation algorithm. Furthermore, the analysis really 
needs to be multi-variant, which means that it should be able to infer separate 
non-failure (and covering) information for different call patterns for a given pred- 
icate in a program. This is illustrated by the following example which, although 
simple, captures the very common case where the same (library) procedure is 
called from a program (in different points) for different purposes: 

Example 1 Consider the (exported) predicate mv/3 (which uses the library 
predicate qsort/2), defined for the sake of discussion as follows: 

mv(A,B,C) :- qsort(A,B), !, C = B.
mv(A,B,C) :- append(A,B,D), qsort(D,C).

Assume the following entry assertion for mv/3: 

:- entry mv(A,B,C) : (list(A,num), list(B,num), var(C)).

which means that the predicate mv(A,B,C) will be called with A and B bound
to lists of numbers, and C a free variable. A multi-variant non-failure analysis
would infer two call patterns for predicate qsort/2:

1. The call pattern qsort(A,B) : (list(A,num), list(B,num)), for which
the analysis infers that it can fail and is not covered, and

2. the call pattern qsort(A,B) : (list(A,num), var(B)), for which the
analysis infers that it will not fail and is covered.
This in turn allows the analysis to infer that the predicate mv/3 will not fail and
is covered (for the call pattern expressed by the entry assertion).

However, a monovariant analysis only considers one call pattern per predicate.
In particular, for predicate qsort/2, the call pattern used is qsort(A,B) :
(list(A,num), term(B))¹ (which is the result of “collapsing” all call patterns
which can appear in the program, so that precision is lost), for which it infers
that qsort/2 can fail and is not covered. This causes the analysis to infer that
the predicate mv/3 can fail (since the calls to qsort/2 in both clauses of predicate
mv/3 are detected as failing) and is covered. □
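The two call patterns of Example 1 can be observed on concrete calls. The text does not give a definition of qsort/2, so the one below (together with the helper part/4) is a hypothetical textbook quicksort; append/3 is assumed to come from the standard lists library.

```prolog
% Hypothetical quicksort; the paper only assumes qsort/2 is a library predicate.
qsort([], []).
qsort([P|Xs], Sorted) :-
    part(Xs, P, Small, Large),
    qsort(Small, S1),
    qsort(Large, S2),
    append(S1, [P|S2], Sorted).

% part(+List, +Pivot, -Small, -Large): split List around Pivot.
part([], _, [], []).
part([X|Xs], P, [X|S], L) :- X =< P, part(Xs, P, S, L).
part([X|Xs], P, S, [X|L]) :- X > P, part(Xs, P, S, L).

% mv/3 as defined in Example 1.
mv(A,B,C) :- qsort(A,B), !, C = B.
mv(A,B,C) :- append(A,B,D), qsort(D,C).
```

With both arguments bound, qsort([2,1],[2,1]) fails (the first call pattern), while qsort([2,1],B) with B free succeeds with B = [1,2] (the second). The call mv([2,1],[3],C) therefore falls through to the second clause and still succeeds, with C = [1,2,3].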



In order to address the different shortcomings of [9] in this paper we start 
by casting the ideas behind non-failure and covering analysis as an abstract 
interpretation [5]. This then allows us to incorporate non-failure analysis into 
a (somewhat modified) standard, generic abstract interpretation engine. This 
has several advantages. First of all, the analysis is now based on a standard 
and well studied theoretical framework. But, most importantly, being able to 
take advantage of standard and well developed analysis engines allows us to 
obtain a simpler and more efficient implementation, with better propagation of 
information, performing an efficient fixpoint. The non-failure and covering
analyses can be performed alongside other abstract interpretation based analyses,
such as those for modes and types, in the same framework. Furthermore, the 
analysis that we obtain is multi-variant (on calls and successes) thus inferring 
separate non-failure (and covering) information for different call patterns for a 
given predicate in a program. Finally, the abstract domain for non-failure can 
be easily enhanced to define a domain for determinacy of predicates. 

Abstract Interpretation [5] is often proposed as a means for inferring prop- 
erties of programs at compile-time. It was shown by Bruynooghe [2], Jones 
and Søndergaard [15], Debray [7], and Mellish [17] that this technique can be
extended to flow analysis of programs in logic programming languages, and sev- 
eral frameworks or particular analyses have evolved since (e.g. [16,20,21,22]). 
Abstract interpretation formalizes the relation between analysis and semantics, 
and, therefore, it is inherently semantics sensitive, different semantic definition 
styles yielding different approaches to program analysis. For logic programs we 
distinguish between two main approaches, namely bottom-up analysis and top- 
down analysis. We also distinguish between goal dependent and goal independent 
analyses. In this paper we use a goal dependent framework, since non-failure 
analysis is inherently goal dependent. In [3], Bruynooghe describes a framework 
for the goal-dependent, top-down abstract interpretation of logic programs. We 
use the PLAI/CiaoPP framework [12,13], which follows [3], but incorporates a 
number of optimizations and efficient fixpoint algorithms, described in [18,19,14]. 



¹ term(B) means that argument B can be bound to any term.
2 Preliminaries

We will denote by $\mathcal{C}$ the universal set of constraints. We let $\theta|_L$ be the
constraint $\theta$ restricted to the variables of the syntactic object $L$. We denote
constraint entailment by $\models$, so that $c_1 \models c_2$ denotes that $c_1$ entails $c_2$.

An atom has the form $p(t_1, \ldots, t_n)$ where $p$ is a predicate symbol and the $t_i$
are terms. A literal is either an atom or a constraint. A goal is a finite sequence of
literals. A rule is of the form $H \texttt{:-} B$ where $H$, the head, is an atom and $B$, the
body, is a possibly empty finite sequence of literals. A constraint logic program,
or program, is a finite set of rules. The definition of an atom $A$ in program $P$,
$\mathit{defn}_P(A)$, is the set of variable renamings of rules in $P$ such that each renaming
has $A$ as a head and has distinct new local (but not head) variables.

The operational semantics of a program is in terms of its “derivations” which
are sequences of reductions between “states”. A state $\langle G \mid \theta \rangle$ consists of a goal
$G$ and a constraint store (or store for short) $\theta$. A state $\langle L :: G \mid \theta \rangle$, where $L$ is a
literal and $::$ denotes concatenation of sequences, can be reduced as follows:

1. If $L$ is a constraint and $\theta \wedge L$ is satisfiable, it is reduced to $\langle G \mid \theta \wedge L \rangle$.

2. If $L$ is an atom, it is reduced to $\langle B :: G \mid \theta \rangle$ for some rule $(L \texttt{:-} B) \in \mathit{defn}_P(L)$.

assuming for simplicity that the underlying constraint solver is complete. We
use $S \leadsto_P S'$ to indicate that in program $P$ a reduction can be applied to
state $S$ to obtain state $S'$. Also, $S \leadsto_P^* S'$ indicates that there is a sequence of
reduction steps from state $S$ to state $S'$. A derivation from state $S$ for program
$P$ is a sequence of states $S_0 \leadsto_P S_1 \leadsto_P \ldots \leadsto_P S_n$ where $S_0$ is $S$ and there is a
reduction from each $S_i$ to $S_{i+1}$. Given a non-empty derivation $D$, we denote by
$\mathit{curr\_goal}(D)$ and $\mathit{curr\_store}(D)$ the first goal and the store in the last state of
$D$, respectively. E.g., if $D$ is the derivation $S_0 \leadsto_P^* S_n$ with $S_n = \langle g :: G \mid \theta \rangle$ then
$\mathit{curr\_goal}(D) = g$ and $\mathit{curr\_store}(D) = \theta$. A query is a pair $(L, \theta)$ where $L$ is a
literal and $\theta$ a store of an initial state $\langle L \mid \theta \rangle$. The set of all derivations from $Q$ for
$P$ is denoted $\mathit{derivations}(P, Q)$. We will denote sets of queries by $\mathcal{Q}$. We extend
derivations to $\mathcal{Q}$ as follows: $\mathit{derivations}(P, \mathcal{Q}) = \bigcup_{Q \in \mathcal{Q}} \mathit{derivations}(P, Q)$.

The observational behavior of a program is given by its “answers” to queries.
A finite derivation from a query $(L, \theta)$ for program $P$ is finished if the last
state in the derivation cannot be reduced. A finished derivation from a query
$(L, \theta)$ is successful if the last state is of the form $\langle \mathit{nil} \mid \theta' \rangle$, where $\mathit{nil}$ denotes
the empty sequence. The constraint $\theta'|_L$ is an answer to $(L, \theta)$. We denote by
$\mathit{answers}(P, Q)$ the set of answers to query $Q$. A finished derivation is failed if the
last state is not of the form $\langle \mathit{nil} \mid \theta \rangle$. Note that $\mathit{derivations}(P, Q)$ contains not
only finished derivations but also all intermediate derivations. A query $Q$ finitely
fails in $P$ if $\mathit{derivations}(P, Q)$ is finite and contains no successful derivation.

Abstract Interpretation. Abstract interpretation [5] is a technique for static
program analysis in which execution of the program is simulated on an abstract
domain ($D_\alpha$) which is simpler than the actual, concrete domain ($D$). For this
study, we restrict to complete lattices over sets both for the concrete $\langle 2^D, \subseteq \rangle$
and abstract $\langle D_\alpha, \sqsubseteq \rangle$ domains.
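Abstract values and sets of concrete values are related via a pair of monotonic
mappings $\langle \alpha, \gamma \rangle$: abstraction $\alpha : 2^D \rightarrow D_\alpha$, and concretization $\gamma : D_\alpha \rightarrow 2^D$,
such that $\forall x \in 2^D : \gamma(\alpha(x)) \supseteq x$ and $\forall y \in D_\alpha : \alpha(\gamma(y)) = y$. In general $\sqsubseteq$ is
defined so that the operations of least upper bound ($\sqcup$) and greatest lower bound
($\sqcap$) mimic those of $2^D$ in a precise sense:

$\forall \lambda, \lambda' \in D_\alpha : \lambda \sqsubseteq \lambda' \Leftrightarrow \gamma(\lambda) \subseteq \gamma(\lambda')$

$\forall \lambda_1, \lambda_2, \lambda' \in D_\alpha : \lambda_1 \sqcup \lambda_2 = \lambda' \Leftrightarrow \gamma(\lambda_1) \cup \gamma(\lambda_2) = \gamma(\lambda')$

$\forall \lambda_1, \lambda_2, \lambda' \in D_\alpha : \lambda_1 \sqcap \lambda_2 = \lambda' \Leftrightarrow \gamma(\lambda_1) \cap \gamma(\lambda_2) = \gamma(\lambda')$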
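The relationship between abstraction and concretization can be made concrete on a toy domain. The sketch below uses a hypothetical sign domain {bot, neg, zero, pos, top} over integers, which is not the non-failure domain of this paper; all predicate names are illustrative, concretization is represented by a membership test because $\gamma$ of an abstract value may be an infinite set, and foldl/4, forall/2 and member/2 are assumed to be available as in SWI-Prolog.

```prolog
% in_gamma(+Abs, +X): X belongs to the concretization of Abs.
% Nothing belongs to the concretization of bot.
in_gamma(top, _).
in_gamma(neg, X) :- X < 0.
in_gamma(zero, 0).
in_gamma(pos, X) :- X > 0.

% alpha_elem(+X, -Abs): best abstract value for a single integer.
alpha_elem(X, neg) :- X < 0.
alpha_elem(0, zero).
alpha_elem(X, pos) :- X > 0.

% lub(+A, +B, -C): least upper bound in the flat sign lattice.
lub(A, A, A) :- !.
lub(bot, A, A) :- !.
lub(A, bot, A) :- !.
lub(_, _, top).

% alpha(+Set, -Abs): abstraction of a finite set of integers (given as a list).
alpha(Set, Abs) :- foldl(join_elem, Set, bot, Abs).
join_elem(X, Acc, Abs) :- alpha_elem(X, A), lub(Acc, A, Abs).

% sound(+Set): every element of Set lies in gamma(alpha(Set)),
% i.e. gamma(alpha(x)) is a superset of x on this finite sample.
sound(Set) :- alpha(Set, Abs), forall(member(X, Set), in_gamma(Abs, X)).
```

For example, alpha([1,2,3], A) gives A = pos, alpha([-1,2], A) gives A = top, and sound/1 succeeds on any list of integers, reflecting the first condition above.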

Goal dependent abstract interpretation takes as input a program $P$, an
abstract domain $D_\alpha$, and a description $\mathcal{Q}_\alpha$ of the possible initial queries to $P$, given
as a set of abstract queries. An abstract query is a pair $(L, \lambda)$, where $L$ is an atom
(for one of the exported predicates) and $\lambda \in D_\alpha$ describes the initial stores for
$L$. A set $\mathcal{Q}_\alpha$ represents the set of queries $\gamma(\mathcal{Q}_\alpha)$, which is defined as $\gamma(\mathcal{Q}_\alpha) =
\{(L, \theta) \mid (L, \lambda) \in \mathcal{Q}_\alpha \wedge \theta \in \gamma(\lambda)\}$. Such an abstract interpretation computes a set
of triples $\mathit{Analysis}(P, \mathcal{Q}_\alpha, D_\alpha) = \{\langle L_p, \lambda^c, \lambda^s \rangle \mid p \text{ is a predicate of } P\}$, where $L_p$
is a (program) atom for predicate $p$. Note that, the analysis being multivariant
(on calls), it may compute several tuples of the form $\langle L_p, \lambda^c, \lambda^s \rangle$ for different call
patterns $(L_p, \lambda^c)$ of each predicate $p$ (including different program atoms $L_p$). If
$p$ is detected to be dead code then $\lambda^c = \lambda^s = \bot$. As usual in abstract
interpretation, $\bot$ denotes the abstract constraint such that $\gamma(\bot) = \emptyset$, whereas $\top$ denotes
the most general abstract constraint, i.e., $\gamma(\top) = D$.

3 The Abstract Interpretation Framework 

PLAI is an analysis system based on the abstract interpretation framework of 
Bruynooghe [3] with the optimizations described in [18]. The framework works 
on an abstraction of the (SLD) AND-OR trees of the execution of a program for 
given entry points. The abstract AND-OR graph makes it possible to provide in- 
formation at each program point, a feature which is crucial for many applications 
(such as, e.g., reordering, automatic parallelization, or garbage collection). 

Program points and abstract substitutions are related as follows. Consider a
clause $h \texttt{:-} p_1, \ldots, p_n$. Let $\lambda_i$ and $\lambda_{i+1}$ be the abstract substitutions to the left
and right of the subgoal $p_i$, $1 \leq i \leq n$ in this clause. Then $\lambda_i$ and $\lambda_{i+1}$ are,
respectively, the abstract call substitution and the abstract success substitution
for the subgoal $p_i$. For this same clause, $\lambda_1$ is the abstract entry substitution and
$\lambda_{n+1}$ is the abstract exit substitution. Entry and exit substitutions are denoted
respectively $\beta_{entry}$ and $\beta_{exit}$ when projected on the variables of the clause head.

Computing the success substitution from the call substitution is done as
follows (see Figure 1(a)). Given a call substitution $\lambda_{call}$ for a subgoal $p$, let
$h_1, \ldots, h_m$ be the heads of clauses which unify with $p$. Compute the entry
substitutions $\beta 1_{entry}, \ldots, \beta m_{entry}$ for these clauses. Compute their exit
substitutions $\beta 1_{exit}, \ldots, \beta m_{exit}$ as explained below. Compute the success substitutions
$\lambda 1_{success}, \ldots, \lambda m_{success}$ from the corresponding exit substitutions. At this point,
all different success substitutions can be considered for the rest of the analysis,
or a single success substitution $\lambda_{success}$ for subgoal $p$ computed by means of an
aggregation operation for $\lambda 1_{success}, \ldots, \lambda m_{success}$. This aggregator is usually the
LUB (least upper bound) of the abstract domain. In the first case the analysis
is multi-variant on successes, in the second case it is not.

Fig. 1. Illustration of the Top-Down Abstract Interpretation Process

Computing the exit substitution from the entry substitution is straightforward
(see Figure 1(b)). Given a clause h :- p_1, ..., p_n and an entry substitution
β_entry for the clause head h, λ_1 is the call substitution for p_1. This one
is computed simply by adding to β_entry an abstraction for the free variables in
the clause. The success substitution λ_2 for p_1 is computed as explained above
(essentially, by repeating this same process for the clauses which match p_1).
Similarly, λ_3, ..., λ_{n+1} are computed. The exit substitution β_exit for this
clause is precisely the projection onto h of λ_{n+1}.

If, from a different subgoal in the program, a different entry substitution is 
computed for an already analyzed clause, different call substitutions will appear 
(for p_1 and possibly the other subgoals). These substitutions can be collapsed
using the LUB operation, or a different node in the graph can be computed. In 
the latter solution, different nodes exist in the graph for each call substitution 
and subgoal, thus yielding an analysis which is multi-variant on calls. 

Note that the framework itself is domain independent. To instantiate it, a
particular analysis needs to define an abstract domain and abstract unification,
and the ⊑ relation, which in turn defines ⊔ (LUB). Abstract unification is
divided into two in the framework, so that it is required to define: (1) how to
compute the entry substitution for a clause C given a subgoal p (which unifies
with the head of C) and its call substitution; and (2) how to compute the
success substitution for a subgoal p given its call substitution and the exit
substitution for a clause C whose head unifies with p. We formalize this with
functions entry_to_exit and call_to_success in Figure 2. The domain dependent
functions used there are:

— call_to_entry(p(u), C, λ) which gives an abstract substitution describing the
effects on vars(C) of unifying p(u) with head(C) given an abstract
substitution λ describing u,

— exit_to_success(λ, p(u), C, β) which gives an abstract substitution
describing u according to β (which describes vars(head(C))) and the effects of
unifying p(u) with head(C) under the abstract substitution λ describing u,

— extend(λ, λ') which extends abstract substitution λ to incorporate the
information in λ' in a way that it is still consistent,

— project_in(v, λ) which extends λ so that it refers to all of the variables v,

— project_out(v, λ) which restricts λ to only the variables v.

entry_to_exit(C, β_entry) =
    λ_1 := project_in(vars(C), β_entry);
    For i := 1 to length(C) do
        λ_{i+1} := call_to_success(q_i(u_i), λ_i);
    return project_out(vars(head(C)), λ_{n+1});

call_to_success(p(u), λ_call) =
    λ := project_out(u, λ_call); λ' := ⊥;
    For each clause C which matches p(u) do
        β_exit := entry_to_exit(C, call_to_entry(p(u), C, λ));
        λ' := λ' ⊔ exit_to_success(λ, p(u), C, β_exit);
    od;
    return extend(λ_call, λ');

Fig. 2. The Top-Down Framework



In the presence of recursive predicates, analysis requires a fixpoint compu- 
tation. In [18,19] a fixpoint algorithm was proposed for the framework that 
localizes fixpoint computations to only the strongly connected components of 
(mutually) recursive predicates. Additionally, an initial approximation to the 
fixpoint is computed from the non-recursive clauses of the recursive predicate. 
Fixpoint convergence is accelerated by updating this value with the information 
from every clause analyzed in turn. The algorithm is (schematically) shown in 
Figure 3. For a complete description see [18,19]. 
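The iteration scheme just described can be sketched in a few lines of Python.
This is only an illustration under simplifying assumptions, not the PLAI
algorithm of [18,19]: the abstract domain is a toy one (sets of parities with
set union as LUB), each recursive clause is modelled as a hypothetical function
from the current approximation to its contribution, and the names seed and
fixpoint are ours.

```python
from typing import Callable, FrozenSet, Sequence

Abs = FrozenSet[str]              # toy abstract value: a set of parities
Clause = Callable[[Abs], Abs]     # current approximation -> clause contribution


def lub(a: Abs, b: Abs) -> Abs:
    """Least upper bound of the toy domain (set union)."""
    return a | b


def seed(non_recursive: Sequence[Abs]) -> Abs:
    """Initial approximation: LUB of the non-recursive clause results."""
    result: Abs = frozenset()     # bottom element
    for value in non_recursive:
        result = lub(result, value)
    return result


def fixpoint(recursive: Sequence[Clause], current: Abs) -> Abs:
    """Re-analyze the recursive clauses until the approximation is stable."""
    updated = current
    for clause in recursive:
        # Each clause already sees the value updated by the previous ones.
        updated = lub(updated, clause(updated))
    return updated if updated == current else fixpoint(recursive, updated)


# nat(0).   nat(s(X)) :- nat(X).   abstracted by the parity of the argument.
FLIP = {"even": "odd", "odd": "even"}
nat_base = [frozenset({"even"})]                               # nat(0)
nat_step = [lambda approx: frozenset(FLIP[p] for p in approx)]  # nat(s(X))
nat_success = fixpoint(nat_step, seed(nat_base))
```

Here the seed {even} comes from the non-recursive clause, the first pass over
the recursive clause adds odd, and a second pass confirms that nothing changes.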

4 Abstract Framework, Domain, and Operations for 
Non-failure Analysis 

In the non-failure analysis, the covering test is instrumental. In fact, covering can 
be seen as a notion that characterizes the fact that execution of a query will not 
finitely fail, i.e., if it has finished derivations then at least one is successful. Note 
that, as in [9], non-failure does not imply success: a predicate that is non-failing
may nevertheless not produce an answer because it does not terminate. 

Definition 1 (Covering). Given computation state ⟨g :: G | θ⟩ in the execution
of program P, define the global answer constraint of goal g in store θ as:

$$c = \bigvee \{\, \mathit{curr\_store}(D') \mid D' \in \mathit{derivations}(P, \langle g, \theta \rangle) \text{ and } D' \text{ is maximal} \,\}$$

Let u denote the variables of g already constrained in θ, call them the input
variables. We say that g is covered in θ iff θ↓u ⊨ c↓u.

It is not difficult to show that, in a pure language, where failure can only
be caused by constraint store inconsistency, covering is a sufficient condition
for non-failure. Indeed, if g is covered in θ, i.e., θ↓u ⊨ c↓u, then one of the
disjuncts in (the projection of) c is entailed. This corresponds to a (maximal)
derivation of ⟨g, θ⟩, and this derivation cannot be failed, since, if it were,
it would be inconsistent, and no inconsistent constraint can be entailed by a
consistent one. Therefore, either such derivation is infinite, or, if finite, it
is successful.

If g is covered in θ then ⟨g, θ⟩ does not finitely fail.

call_to_success_recursive(p(u), λ_call) =
    λ := project_out(u, λ_call); λ' := ⊥;
    For each non-recursive clause C which matches p(u) do
        β_exit := entry_to_exit(C, call_to_entry(p(u), C, λ));
        λ' := λ' ⊔ exit_to_success(λ, p(u), C, β_exit);
    od;
    λ'' := fixpoint(p(u), λ, λ');
    return extend(λ_call, λ'');

fixpoint(p(u), λ, λ') =
    λ'' := λ';
    For each recursive clause C which matches p(u) do
        β_exit := entry_to_exit(C, call_to_entry(p(u), C, λ));
        λ'' := λ'' ⊔ exit_to_success(λ, p(u), C, β_exit);
    od;
    If λ'' = λ' then return λ''
    else return fixpoint(p(u), λ, λ'');

Fig. 3. The Fixpoint Computation

A key issue in non-failure analysis will thus be how to approximate the current 
store and the global answer constraint so that covering can be effectively and 
accurately approximated. In [9] such an approximation is defined in the following 
terms: A goal is non-failing if there is a subset of clauses of the predicate which 
do not fail and which match the input types of the goal. This “matching” is 
the so-called covering test, and basically amounts to the analysis being able to 
gather, for each such clause, enough constraints on the input variables of the 
goal to be able to prove that, for each of the variables, any element in the 
corresponding type satisfies at least the constraint gathered for one clause. An 
analysis for non-failure thus needs to traverse the clauses of a predicate to check 
non-failure of the clause body goals, collect constraints that approximate the 
global answer constraint, and finally check that they cover the input types of 
the original goal. In the rest of this section, we show how to accommodate the 
abstract interpretation based framework of the previous section to perform these 
tasks, and define an abstract domain suitable for them. 
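To make the covering test concrete, the following Python sketch checks it by
brute force. It is a minimal illustration under strong assumptions: the input
type is a finite, explicitly enumerated set of values, and each clause
contributes one test on the input variables. The covering algorithm of [9]
works symbolically on (generally infinite) types; the function covers and the
sample tests below are hypothetical.

```python
from itertools import product
from typing import Callable, Iterable, Sequence, Tuple

Point = Tuple[int, ...]            # one valuation of the input variables
Test = Callable[[Point], bool]     # constraint gathered for one clause


def covers(input_type: Iterable[Point], clause_tests: Sequence[Test]) -> bool:
    """True if every element of the type satisfies at least one clause test."""
    return all(any(test(p) for test in clause_tests) for p in input_type)


# A small finite stand-in for the type num x num of the pair (E, C).
small_ints = range(-3, 4)
pairs = list(product(small_ints, repeat=2))

# Tests in the style of a partition predicate: E < C in one clause, E >= C in another.
partition_tests = [lambda p: p[0] < p[1], lambda p: p[0] >= p[1]]
# A variant that forgets the case E = C, so the type is not covered.
strict_tests = [lambda p: p[0] < p[1], lambda p: p[0] > p[1]]

partition_covered = covers(pairs, partition_tests)
strict_covered = covers(pairs, strict_tests)
```

The first set of tests covers every pair, while the second leaves the pairs
with E = C uncovered, so a goal using it could finitely fail.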



4.1 Abstract Domain 

The abstractions for non-failure analysis are made of four components. The first
two are (abstractions of) constraints that represent the current store and the
global answer constraint for the current goal. This is the core part of the
domain. The other two components carry the results of the covering test,
specifying if the current constraint store covers the global answer constraint,
and if this implies that the computation may fail or not. The covering and
non-failure information is represented by values of the set B = {⊥, 0, 1, ⊤},
where 0 and 1 are not comparable in the ordering. For covering, 0 is interpreted
as “not covered” and 1 as “covered”. For non-failure, 0 is interpreted as “not
failing” and 1 as “failing”.

Definition 2 (Abstract Domain). Let $C^{\alpha_1}$ and $C^{\alpha_2}$ be
abstract domains for $C$. The abstract domain for non-failure is the set

$$\mathcal{F} = \{\, (s, c, o, f) \mid s \in C^{\alpha_1},\ c \in C^{\alpha_2},\ o \in B,\ f \in B \,\}$$

The ordering in domain $\mathcal{F}$ is induced from that in B, so that
(overloading ⊑): (s_1, c_1, o_1, f_1) ⊑ (s_2, c_2, o_2, f_2) iff f_1 ⊑ f_2.

In an element (s, c, o, f) ∈ $\mathcal{F}$, components s and c are abstractions
α_1 and α_2 of the constraint domain C. The usual approximations used (e.g., in
[9]) are types (and modes) for s, and a finite set of (concrete) constraints
for c.
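As a rough illustration of this domain, the sketch below encodes B and the
induced ordering in Python. The class and function names (B, NonFailure,
b_leq, b_lub, leq) are our own, and the s and c components are represented
simply as sets of strings, which is an assumption made only for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet


class B(Enum):
    BOT = "bot"
    ZERO = "0"
    ONE = "1"
    TOP = "top"


def b_leq(x: B, y: B) -> bool:
    """Ordering of B: bottom below everything, top above, 0 and 1 incomparable."""
    return x is y or x is B.BOT or y is B.TOP


def b_lub(x: B, y: B) -> B:
    """Least upper bound in B."""
    if b_leq(x, y):
        return y
    if b_leq(y, x):
        return x
    return B.TOP                     # the only remaining case is {0, 1}


@dataclass(frozen=True)
class NonFailure:
    s: FrozenSet[str]   # abstraction of the current store (types and modes)
    c: FrozenSet[str]   # abstraction of the global answer constraint
    o: B                # covering information
    f: B                # non-failure information


def leq(a: NonFailure, b: NonFailure) -> bool:
    """Ordering on the tuples, induced by the f component only."""
    return b_leq(a.f, b.f)
```

Note that, following the definition, two tuples that differ only in s, c or o
are equivalent in this ordering.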

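Definition 3 (Abstraction Function). The abstraction of a derivation D in
the execution of program P, such that curr_store(D) = θ and curr_goal(D) = g,
and the input variables and global answer constraint of g in θ are respectively
u and c, is $\alpha(D) = (\theta^{\alpha_1}, c^{\alpha_2}, o, f)$, where:

$$f = \begin{cases} 1 & \text{if } D \text{ is failed} \\ 0 & \text{otherwise} \end{cases} \qquad o = \begin{cases} 1 & \text{if } \theta^{\alpha_1}{\downarrow}_u \models^{\alpha} c^{\alpha_2}{\downarrow}_u \\ 0 & \text{otherwise} \end{cases}$$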

It is easy to show that such an abstraction is correct, provided that α_1 and
α_2 are also correct abstractions, and that the corresponding abstract covering
test ($\models^{\alpha}$) correctly approximates Definition 1. For α_1 we have
already mentioned the use of type and mode information. One possibility for α_2
is to use only those constraints appearing explicitly in the clause bodies of
the predicate whose covering test is to be performed (the current goal g in the
derivation).

Example 2 Consider the following (contrived) predicates:

p(X,Y,Z) :- X =< Y, q(X,Z).
q(X,Y) :- X =< Y.

The global answer constraint for p(X,Y,Z) is X =< Y ∧ X =< Z, but it can be
approximated simply by X =< Y, the only constraint in the definition of p/3. □

One rationale for the above choice might be that collecting all constraints 
in derivations may not be possible during a compile-time analysis (since such 
constraints are only known during execution), or may lead to non-termination of 
the analysis. However, the first problem can be alleviated by proper abstractions 
of the tests (such as a depth-k abstraction, in a way similar to [6]), and the second 
problem only occurs for recursive predicates. Thus, the simplest solution to
the termination problem is to avoid collecting constraints in recursive calls.

Footnote: Note that this does not imply that recursive calls are simply ignored.
They need to be considered to check that they are indeed non-failing, even
though their global answer constraint is not computed.

Example 3 The global answer constraint for the predicate sorted/1 defined
below includes a constraint for each two elements in the input list, the length
of which is not in general known at compile-time.

sorted([]).
sorted([_]).
sorted([X,Y|L]) :- X =< Y, sorted([Y|L]). □

Our solution to this problem is to collect only constraints that refer literally
to the predicate arguments in the program clause head, which also excludes in 
general (but not always) the constraints arising from recursive calls. 

Example 4 Consider again the predicate sorted/1 defined in the previous
example. We collect constraints only for the clause head argument [X,Y|L], which
amounts to only one constraint: X =< Y (since the recursive call does not
provide constraints for the head arguments that appear literally in the
program).

Consider, on the other hand, predicate p/3 of Example 2. In this case the
complete global answer constraint for p(X,Y,Z) will be collected:
X =< Y ∧ X =< Z, since the two single constraints can be “projected” onto the
clause head. □
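The selection of constraints in Example 4 can be mimicked with a purely
syntactic filter, sketched below in Python. This is a simplification we assume
for illustration only: constraints are plain strings, variables are recognized
as identifiers starting with an uppercase letter, and a constraint is kept only
if all of its variables occur literally in the clause head. The helper names
variables and collect, and the constraint Y =< Y1 standing for what a recursive
call might contribute, are hypothetical.

```python
import re
from typing import FrozenSet, Iterable, List


def variables(constraint: str) -> FrozenSet[str]:
    """Prolog-style variables: identifiers starting with an uppercase letter."""
    return frozenset(re.findall(r"\b[A-Z][A-Za-z0-9_]*", constraint))


def collect(head_vars: Iterable[str], constraints: Iterable[str]) -> List[str]:
    """Keep only the constraints that mention head variables exclusively."""
    allowed = frozenset(head_vars)
    return [c for c in constraints if variables(c) <= allowed]


# sorted([X,Y|L]) :- X =< Y, sorted([Y|L]).
# The recursive call would relate Y to an element Y1 hidden inside L.
sorted_constraints = collect(["X", "Y", "L"], ["X =< Y", "Y =< Y1"])

# p(X,Y,Z) :- X =< Y, q(X,Z).   with   q(X,Y) :- X =< Y.
p_constraints = collect(["X", "Y", "Z"], ["X =< Y", "X =< Z"])
```

For sorted/1 only X =< Y survives, whereas for p/3 both constraints are kept
because they can be expressed over the head variables.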

Note that such a solution yields an under-approximation of the global answer
constraints. Given the use of type and mode information, which are in general
over-approximations, we have that, for any element (s, c, o, f) ∈
$\mathcal{F}$, given current constraint store θ and global answer constraint ω,
$s = \theta^{\alpha_1}$ is an over-approximation of θ, and
$c = \omega^{\alpha_2}$ is an under-approximation of ω. In this situation, it
is not difficult to prove that
$\theta^{\alpha_1}{\downarrow}_u \models^{\alpha} \omega^{\alpha_2}{\downarrow}_u$
correctly approximates covering: $\theta{\downarrow}_u \models \omega{\downarrow}_u$.



4.2 Abstract Operations 

Abstract values (s, c, o, f) ∈ $\mathcal{F}$ are built during analysis in the
following way: f is carried along during the abstract computation by the
abstract operations below, o is computed from the covering test, c is collected
as explained above, and for s, type and mode analysis is performed. Thus, our
analysis is in fact three-fold: it carries out mode, type, and non-failure
analyses simultaneously. We focus now on the abstract operations for
non-failure, given that those for types and modes are standard:

— call_to_entry(p(u), C, λ) solves head unification p(u) = head(C), and checks
that it is consistent with the c component of λ. If it is not, it returns ⊥,
otherwise, the resulting abstraction.

If p(u) ∈ C (the constraint domain), i.e., if it happens to be a constraint
itself, then no clause C exists, and p(u) itself is added to the c component.
In this case the following exit_to_success function is not called.

— exit_to_success(λ, p(u), C, β) adds the equations resulting from unification
p(u) = head(C) to the c component of β and projects it onto vars(u).

It is the projection performed here that gets rid of useless constraints, like
in the case of Example 4. Constraints that cannot be projected onto the (goal)
variables u are simply dropped in the analysis.

— λ ⊎ λ' adds abstraction λ to the set λ' if λ is non-failing.

— extend(λ, λ') performs the covering test for λ' (a set of abstractions); if it
is successful, the c component of λ' is merged with that of λ.

This operation uses the covering algorithm described in [9], which takes the
global answer constraint c and a type assignment for the input variables
appearing in c. Given a finite set of variables V, a type assignment over
V is a mapping from V to a set of types. This is computed from the type
information in the first component of λ. Input variables are determined from
the mode information in that same component. The global answer constraint
is obtained as the disjunction of the c components of each abstraction in λ'.

Footnote (on the solution adopted in Section 4.1): However, we plan to
investigate other solutions. In particular, the use of a depth-k abstraction
seems to be a very promising one.

4.3 Adapting the Analysis Framework 

The framework described in the previous section is not adequate for non-failure
analysis. The main reason for this is that the aggregation function for the
successive exit abstractions of the different clauses is not the LUB anymore. In
non-failure analysis, the constraints for each clause need to be gathered
together, and a covering test on the set of constraints needs to be performed.
Another difference is that the covering test should only consider constraints
from clauses that are not guaranteed to fail altogether; therefore the
aggregator must be able to discriminate abstract substitutions on this
criterion.

We have adapted the definition of the call_to_success function to reflect the
aggregation operator. The adapted definition is shown in Figure 4. Note that, as
a result of this, λ' in the algorithm is no longer an abstract substitution, but
a set of them. This is input to extend, which is in charge of the covering test.



call_to_success(p(u), λ_call) =
    λ := project_out(u, λ_call); λ' := ∅;
    For each clause C which matches p(u) do
        β_exit := entry_to_exit(C, call_to_entry(p(u), C, λ));
        λ' := λ' ⊎ exit_to_success(λ, p(u), C, β_exit);
    od;
    return extend(λ_call, λ');

Fig. 4. The Top-Down Framework for Non-Failure Analysis
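The change of aggregator in Figure 4 can be pictured with the small Python
sketch below. It assumes, for illustration only, that each clause result is
reduced to its c component (a set of constraint strings) plus a flag telling
whether the clause is guaranteed to fail; the names ClauseResult, uplus and
global_answer are ours, and the failing fourth clause is made up.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class ClauseResult:
    c: FrozenSet[str]   # constraints collected for one clause
    failing: bool       # True when the clause is guaranteed to fail


def uplus(acc: List[ClauseResult], new: ClauseResult) -> List[ClauseResult]:
    """Set-style aggregation: keep a clause result only if it is non-failing."""
    return acc if new.failing else acc + [new]


def global_answer(acc: List[ClauseResult]) -> List[FrozenSet[str]]:
    """Disjunction of the c components, one disjunct per surviving clause."""
    return [result.c for result in acc]


clause_results = [
    ClauseResult(frozenset({"L = []"}), failing=False),
    ClauseResult(frozenset({"L = [E|_]", "E < X"}), failing=False),
    ClauseResult(frozenset({"L = [E|_]", "E >= X"}), failing=False),
    ClauseResult(frozenset({"X = a"}), failing=True),   # hypothetical failing clause
]

aggregated: List[ClauseResult] = []
for result in clause_results:
    aggregated = uplus(aggregated, result)
disjuncts = global_answer(aggregated)
```

Unlike a LUB, this aggregation keeps the per-clause constraints apart, so that
the covering test can later be run on their disjunction.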



When fixpoint computation is required, adapting the framework is a bit more
involved. Basically, since the aggregation operator is not LUB, fixpoint
detection cannot be performed right after the success substitution has been
computed. Normally, it is the LUB that is used for updating the successive
approximations to the fixpoint value, and fixpoint detection works by simply
comparing the initial and the final values for the success substitution. In
non-failure analysis, the covering test must be performed first, and only after
this one has been performed, the test for the fixpoint can be done. The
resulting algorithm is shown in Figure 5. It is basically a simpler fixpoint
iterator over the function call_to_success, abandoning the sophisticated
fixpoint computation of Figure 3.

Footnote: Note how this information could be used to improve the results of
other analyses.



call_to_success_recursive(p(u), λ_call) =
    λ := project_out(u, λ_call);
    return fixpoint(p(u), λ, ⊥);

fixpoint(p(u), λ, λ') =
    λ'' := call_to_success(p(u), λ);
    If λ'' = λ' then return λ''
    else return fixpoint(p(u), λ, λ'');

Fig. 5. The Fixpoint Computation for Non-Failure Analysis



A Running Example We now illustrate our analysis by means of a detailed
example of how it proceeds. Consider the program (fragment) below:

qsort(As,Bs) :- qsort(As,Bs,[]).

qsort([X|L],R,R2) :-
    partition(L,X,L1,L2), qsort(L2,R1,R2), qsort(L1,R,[X|R1]).
qsort([],R,R).

partition([],_,[],[]).
partition([E|R],C,[E|Left1],Right) :- E < C, partition(R,C,Left1,Right).
partition([E|R],C,Left,[E|Right1]) :- E >= C, partition(R,C,Left,Right1).

Let the abstract call pattern for atom qsort(As,Bs,[]) be

({list(As, num), var(Bs)}, true, 1, 0).

Upon entering the first clause defining qsort/3, the result of call_to_entry
(restricted to the head variables) is

({num(X), list(L, num), var(R), [](R2)}, true, 1, 0)

plus, additionally, {var(R1), var(L1), var(L2)} for the free variables in the
clause. Once projected, this gives the call pattern for the first literal in
that clause:

({list(L, num), num(X), var(L1), var(L2)}, true, 1, 0).

Footnote: To be concise, we denote with [](A) that the type of A is that of the
empty lists.

We omit the analysis of the partition predicate. After the fixpoint computation
for this predicate, however, we will have a set of three abstract elements
corresponding to the abstraction of the three clauses. For brevity, we express
this set as a single abstraction where it is the c component that is a set,
instead. Note that this is possible because all other components (types, modes,
covering, non-failure) of the abstractions in the set are the same. Thus, we
have:

( { list(L, num), num(X), list(L1, num), list(L2, num) },
  { L = [] ∧ L1 = [] ∧ L2 = [],
    L = [E|_] ∧ E < X ∧ L1 = [E|_],
    L = [E|_] ∧ E >= X ∧ L2 = [E|_] }, 1, 0 ).

This is now extended (by abstract function extend) to the corresponding
program point of the clause of qsort. First, the covering test is performed, and
it succeeds, since list(L, num), num(X) indeed covers the global answer
constraint projected onto the input variables:

(L = [E|_] ∧ (E < X ∨ E >= X)) ∨ L = [].

Therefore, computation is still covered and non-failing. This, together with 
the projection of the c component onto the variables of the first clause of qsort, 
yields success abstraction for partition: 

( { num(X), list(L, num), var(R), var(R1), [](R2),
    list(L1, num), list(L2, num) }, true, 1, 0 )

where the c component is still true since the projection onto the clause
variables factors out the previously computed global answer constraint. Now,
analysis will proceed into call qsort(L2,R1,R2) with

({list(L2, num), var(R1), [](R2)}, true, 1, 0).

Since this is basically the same call pattern that we started with, no new
fixpoint computation is started in this case. On the other hand, a new fixpoint
computation is started for the second recursive call qsort(L1,R,[X|R1]) with

({list(L1, num), var(R), num(X), list(R1, num)}, true, 1, 0).

This is a new call pattern for the qsort predicate, which initiates a new
fixpoint computation. The fixpoint value obtained in this computation is the
same abstraction, except for the type of R which on output is a list. Finally,
exit_to_success now lifts this result to the original goal qsort(As,Bs,[])
giving:

({list(As, num), list(Bs, num)}, As = [_|_], 1, 0).

The analysis of the non-recursive clause immediately gives:

({[](As), [](Bs)}, As = [] ∧ Bs = [], 1, 0),

and extend computes the covering test for the set of the above two abstractions
with the initial input abstraction, in which the input types are list(As, num).
Certainly, this type covers the (projected) global answer constraint
As = [_|_] ∨ As = []. Thus, the goal is still covered and non-failing.

Finally, since the abstraction now computed is only the result of a first iter- 
ation of the fixpoint computation, a new iteration is started. The result in this 
case is the same, and fixpoint computation finishes with that very same result. 



Footnote: This very same “trick” is used in the implementation.

Footnote: Here, we save the reader from some more fixpoint iterations that will
be taking place. However, the results are as indicated.

5 Implementation Results

We have constructed a prototype implementation in (Ciao) Prolog by adapting 
the framework of the PLAI implementation and defining the abstract operations 
for non-failure analysis that we have described in this paper. Most of these ab- 
stract operations have been implemented by reusing code of the implementation 
in [9], such as for example, the covering algorithm. We have incorporated the 
prototype in the Ciao/CiaoPP multiparadigm programming system [12,13,4] and 
tested it on the benchmarks used in the non-failure analysis of Debray et al. [9],
plus some benchmarks exhibiting paradigmatic behaviours, plus a last group
with those used in the cardinality analysis of Braem et al. [1]. These two
analyses are the closest related previous work that we are aware of. Some
relevant results of these tests for non-failure analysis are presented in
Table 1. Program
lists the program names, N the number of predicates in the program, F and 
C are the number of non-failing predicates detected by the non-failure analysis 
in [9], and the cardinality analysis in [1], respectively. 

Note that our multi-variant analysis can infer several variants (call patterns) 
for the same predicate, where some of them may be non-failing (resp. covered) 
and the other ones can be failing (resp. not covered). For instance, in the case of 
the program Mv in Table 1 (also described in Example 1), which has 4 predicates 
(mv/3, qsort/2, partition/4 and append/3), the analysis infers one variant for 
mv/3, which is non-failing and covered, 2 variants for qsort/2 (one of them which 
is non-failing and covered, and the other one which is failing and not covered), 
one variant for partition/4, which is non-failing and covered, and 3 variants 
for append/3 (2 of them which are non-failing and covered, and the other one 
which is failing and not covered). For this reason, and in order to make the 
results comparable, column AF shows two figures (both corresponding to the
analysis presented in this paper): the number of predicates such that all of
their variants (call patterns) are detected as non-failing, and (in
parentheses) the number of predicates such that some of their variants are
detected as non-failing (this second figure is omitted if it is equal to the
first one).

Similarly, ACov shows two figures (both corresponding to the analysis
presented in this paper): the number of predicates detected to cover all of
their (calling) types (variants), and (in parentheses) the number of predicates
detected to cover some of their (calling) types. Cov is the number of predicates
detected to cover their (calling) types by the analysis in [9].

T_AF and T_F are the total times (in milliseconds) required by the analysis
presented in this paper and the analysis in [9] respectively (both of which
include the time required to derive the modes and types). The timings were taken
on a medium-loaded Pentium IV Xeon 2.0 GHz with two processors, 1 GB of RAM,
running Red Hat Linux 8.0, and averaging several runs and eliminating
the best and worst values. Ciao version 1.9.111 and CiaoPP-1.0 were used.

Analysis time averages (per predicate) are also provided in the last row of
the table. From these numbers, it is clear that the new implementation based on
the abstract interpretation engine is more efficient than the previous one. It
is also more precise, as shown for example in the benchmarks Mv, Zebra, Family,
Blocks, Reach, and Plan.

Program   N   AF      F   C    ACov     Cov  T_AF     T_F      T_AF/T_F
Hanoi     2   2       2   N/A  2        2    33       242      0.14
Fib       1   1       1   N/A  1        1    17       22       0.77
Tak       1   1       1   N/A  1        1    9        11       0.82
Subs      1   1       1   N/A  1        1    5        33       0.15
Reverse   2   2       2   N/A  2        2    17       29       0.59
Mv        4   2 (4)   1   N/A  2 (4)    2    54       102      0.53
Zebra     6   2       1   N/A  5 (6)    4    1008     1100     0.92
Family    3   3       1   N/A  3        2    10       18       0.56
Blocks    7   1 (2)   0   N/A  4 (5)    4    30       59       0.51
Reach     2   2       0   N/A  2        1    19       30       0.63
Bid       20  5 (8)   5   N/A  14 (17)  14   3089     3369     0.92
Occur     4   1 (3)   1   N/A  1 (3)    1    69       78       0.88
Plan      16  5 (8)   3   0    11 (13)  10   2626     4128     0.64
Qsort     3   3       3   0    3        3    29       65       0.45
Qsort2    5   3       3   0    3        3    33       76       0.43
Queens    5   2 (3)   2   0    3 (4)    3    60       74       0.81
Pg        10  2 (3)   2   0    6 (9)    6    412      477      0.86
Mean                                         38 (/p)  58 (/p)  0.67 (/p)

Table 1. Accuracy and efficiency of the non-failure analysis (times in ms).
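As a small sanity check on the efficiency figures, the ratio column of Table 1
can be recomputed from the two timing columns. The Python snippet below does so
for a few rows copied from the table; the dictionary layout and the helper name
ratio are ours, and rounding to two decimals is an assumption about how the
ratios were reported.

```python
# (T_AF, T_F) in milliseconds, copied from a few rows of Table 1.
times_ms = {
    "Hanoi": (33, 242),
    "Fib": (17, 22),
    "Mv": (54, 102),
    "Plan": (2626, 4128),
    "Qsort": (29, 65),
}


def ratio(t_af: int, t_f: int) -> float:
    """Time of the new analysis relative to the previous one, to two decimals."""
    return round(t_af / t_f, 2)


ratios = {name: ratio(t_af, t_f) for name, (t_af, t_f) in times_ms.items()}
```

A ratio below 1 means that the analysis presented here took less time than the
analysis in [9] on that benchmark.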

6 Conclusions 

We have described a non-failure analysis based on abstract interpretation, which
extends the previous proposal of Debray et al. [9]. Our analysis improves on its
precision, and enjoys a clear theoretical setting and a simpler implementation.
Also, the implementation is more efficient. The abstract domain underlying the
analysis can be easily modified to cater for a determinacy analysis. Such an
analysis,
provided with a depth-k abstraction, would be the abstract interpretation coun- 
terpart of determinacy analyses such as that of [6] . We are currently working on 
the verification of this proposition. 

The implemented analysis we have described in this paper is currently inte- 
grated in CiaoPP, and is being used for lower-bounds cost analysis, granularity 
control, and program debugging. Arguably, although our presentation covers 
strictly constraint logic programming, the technique could be easily applied to 
functional logic languages with similar results, as is indeed the case in the Ciao 
system, where the analysis presented works without modification for Ciao’s
functional subset or for combinations of functions and predicates.

Abstract. Sharing among program variables is vital information when
analyzing logic programs. This information is often expressed either as 
sets or as pairs of program variables that (may) share. That is, either 
as set-sharing or as pair-sharing. It has been recently argued that (a) 
set-sharing is interesting not as an observable property in itself, but as 
an encoding for accurate pair-sharing, and that (b) such an encoding is 
in fact redundant and can be significantly simplified without loss of pair- 
sharing accuracy. We show that this is not the case when set-sharing is 
combined with other kinds of information, such as the popular freeness. 



1 Introduction 

Program analysis is the process of inferring at compile-time information
about run-time properties of programs. In logic programs one of the most studied
run-time properties is sharing among program variables. Two program variables 
share in a given run-time store if the terms to which they are bound have at 
least one run-time variable in common. A set of program variables share if they 
all have at least one run-time variable in common. The former kind of sharing 
is called pair-sharing while the latter is called set-sharing. Any of the two may 
be target observables of an analysis. 
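The two notions can be illustrated on a concrete run-time store with the Python
sketch below. It assumes a simplified representation in which each program
variable is mapped to the set of run-time variables occurring in the term it is
bound to; the function names and the sample store are hypothetical. The
set-sharing computed here groups, for each run-time variable, the program
variables whose terms contain it.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Set, Tuple

# program variable -> run-time variables occurring in the term bound to it
Store = Dict[str, FrozenSet[str]]


def pair_sharing(store: Store) -> Set[Tuple[str, str]]:
    """Pairs of program variables whose terms have a run-time variable in common."""
    return {
        (x, y)
        for x, y in combinations(sorted(store), 2)
        if store[x] & store[y]
    }


def set_sharing(store: Store) -> Set[FrozenSet[str]]:
    """For each run-time variable, the set of program variables containing it."""
    runtime_vars = set().union(*store.values())
    return {
        frozenset(x for x, occurring in store.items() if v in occurring)
        for v in runtime_vars
    }


# X = f(U,V), Y = g(V), Z = W, G = a (ground)
store: Store = {
    "X": frozenset({"U", "V"}),
    "Y": frozenset({"V"}),
    "Z": frozenset({"W"}),
    "G": frozenset(),
}
pairs = pair_sharing(store)
groups = set_sharing(store)
```

In this store only X and Y share, through V, while the set-sharing additionally
records that X and Z each have a run-time variable (U and W) not shared with
any other program variable.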

The importance (and hence popularity) of sharing comes from two sources. 
First, sharing information is in itself vital for several applications such as ex- 
ploitation of independent AND-parallelism [18,5], occurs check reduction [26,27], 
and compile-time garbage collection [23]. And second, sharing can be used to 
accurately keep track of other interesting run-time properties such as freeness 
(a program variable is free in a run-time store if it is either unbound or bound 
to a run-time variable) . 

Sharing analysis has therefore raised an enormous amount of interest in our 
research community, with many different analysis domains being proposed in 
the literature (see e.g., [27,17,25,3,19]). Two of the best known sharing analysis 
domains are ASub defined by Søndergaard [27] and Sharing defined by Jacobs
and Langen [17,18]. The main difference between these two domains is the way
in which they represent sharing information: while ASub keeps track of pairs of
program variables that possibly share, Sharing keeps track of sets of program
variables that possibly share certain variable occurrences.



Y. Kameyama and P.J. Stuckey (Eds.): FLOPS 2004, LNCS 2998, pp. 117—131, 2004. 
© Springer-Verlag Berlin Heidelberg 2004 






Francisco Bueno and Maria Garcia de la Banda 



These differences have subtle consequences. On the one hand, the pair sharing 
encoding in ASub allows it to keep track of linear program variables (a program 
variable is linear in a run-time store if it is bound to a term which does not have 
multiple occurrences of the same run-time variable). Linearity information, in 
turn, allows ASub to improve the accuracy of the abstract sharing operations. On 
the other hand, the set sharing encoding in Sharing allows it to represent several 
other kinds of information (such as groundness and sharing dependencies) which 
also result in more accurate abstract operations. In fact, when combined with
linearity, Sharing is strictly more accurate than ASub. In practice, this
accuracy improvement has proved to be significant [7].

As a result, Sharing became the standard choice for sharing analysis, usually
combined with other kinds of information such as freeness or structural
information, even though its complexity can have significant impact on
efficiency.
However, the benefits of using set sharing for sharing analysis have been recently 
questioned (see [10,1,2]). As a paradigm of the case, we cite the title of a paper 
by Bagnara, Hill, and Zaffanella: “Set-Sharing is redundant for Pair-Sharing” 
[1,2]. In this paper, the authors state the following:

Assumption: The goal of sharing analysis for logic programs is to detect 
which pairs of variables are definitely independent (namely they cannot 
be bound to terms having one or more variables in common). 

As far as we know this assumption is true. In the literature we can find 
no reference to the “independence of a set of variables”. All the proposed 
applications of sharing analysis (compile-time optimizations, occur-check 
reduction and so on) are based on information about the independence 
of pairs of variables. 

Based on the above assumption, the authors focus on defining a simpler version
of Sharing which is, however, just as precise as far as pair-sharing is
concerned. This new simpler domain, referred to in the future as SS^ρ, is
obtained by eliminating from Sharing information which is considered
“redundant” w.r.t. the pair-sharing property. This elimination allows further
simplification of the abstract operations in SS^ρ which can significantly
improve its efficiency.

The popularity of the Sharing domain combined with the great accuracy
and efficiency results obtained for $SS^\rho$ (and the clarity with which the authors
explained the intricacies of the Sharing domain) ensured the paper had a
significant impact on the community, with many researchers now accepting that
set-sharing is indeed redundant for pair-sharing (see, e.g., [20,8,22,21]).

The aim of this paper is to prove that this is not always the case. In particular,
we will show that: (1) there exist applications which use set-sharing analysis
(combined with freeness) to infer properties other than sharing between pairs
of variables; and (2) when combined with information capable of distinguishing
among the different variable occurrences represented by Sharing, this domain
can yield results not obtainable with $SS^\rho$, including better pair-sharing. Such a
combination is found in at least two common situations: when Sharing is used
as a carrier for other analyses (such as freeness), and when the analysis process
is improved with extra information (such as in-lined knowledge of the semantics
of some predicates, for example builtins). Possible approaches to combine $SS^\rho$
with other kinds of information without losing accuracy are also suggested.

We believe our insights will contribute to a better understanding of an
abstract domain which, while being one of the most popular and most intensively
studied abstract domains ever defined, remains somewhat misunderstood.

2 Preliminaries 

Let us start by introducing our notation as well as the basics of the Sharing
domain [17,18]. In doing this we will mainly follow the extremely clear summary
presented in [1]. Given a set $S$, $\wp(S)$ denotes the powerset of $S$, and $\wp_f(S)$ denotes
the set of all the finite subsets of $S$. $\mathcal{V}$ denotes a denumerable set of variables.
$Var \in \wp_f(\mathcal{V})$ denotes a finite set of variables, called the variables of interest (e.g.,
the variables of a program). The set of variables in a syntactic object $o$ is denoted
$vars(o)$. $\mathcal{T}_{\mathcal{V}}$ is the set of first order terms over $\mathcal{V}$. A substitution $\theta$ is a mapping
$\theta : \mathcal{V} \to \mathcal{T}_{\mathcal{V}}$, whose application to variable $x$ is denoted by $x\theta$. Substitutions are
denoted by the set of their bindings: $\theta = \{x \mapsto x\theta \mid x\theta \neq x\}$. We define the
image of a substitution $\theta$ as the set $img(\theta) \stackrel{\mathrm{def}}{=} \bigcup \{vars(x\theta) \mid x \in Var\}$.

The Sharing domain is formally defined as follows. Let $SH = \wp(SG)$, where
$SG \stackrel{\mathrm{def}}{=} \{S \subseteq Var \mid S \neq \emptyset\}$. Each element $S \in SG$ is called a sharing set. We will
write sharing sets as strings with the variables that belong to them, e.g., sharing set
$\{x, y, z\}$ will be denoted $xyz$. A sharing set of size 2 is called a sharing pair.

The function $occ(\theta, v)$ obtains a sharing set that represents the occurrence of
variable $v$ through the variables of interest as per the substitution $\theta$:

$$occ(\theta, v) \stackrel{\mathrm{def}}{=} \{x \in Var \mid v \in vars(x\theta)\}$$

The abstraction of a substitution $\theta$ is obtained by computing all relevant
sharing sets: $\alpha(\theta) \stackrel{\mathrm{def}}{=} \{occ(\theta, v) \mid v \in img(\theta)\}$.

Abstract element $sh \in SH$ approximates substitution $\theta$ iff $\alpha(\theta) \subseteq sh$.
Conversely, the concretization of $sh \in SH$ is the set of all substitutions approximated
by $sh$. Projection over a set $V \subseteq Var$ is given by

$$proj(sh, V) \stackrel{\mathrm{def}}{=} \{S \cap V \mid S \in sh[V]\}$$

where, for any syntactic object $o$ and abstraction $sh \in SH$,

$$sh[o] \stackrel{\mathrm{def}}{=} \{S \in sh \mid S \cap vars(o) \neq \emptyset\}.$$

The pairwise (or binary) union of two abstractions is defined as:

$$sh_1 \uplus sh_2 \stackrel{\mathrm{def}}{=} \{S_1 \cup S_2 \mid S_1 \in sh_1, S_2 \in sh_2\}.$$

The closure under (or star) union of an abstract element $sh$ is defined as the
least set $sh^*$ that satisfies:

$$sh^* = sh \cup \{S_1 \cup S_2 \mid S_1, S_2 \in sh^*\}.$$

Abstract unification for a substitution $\theta$ is given by extending to the set of
bindings of $\theta$ the following abstract unification operation for a binding:

$$amgu(sh, x \mapsto t) = (sh \setminus (sh[x] \cup sh[t])) \cup (sh[x]^* \uplus sh[t]^*).$$
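The binary union, star closure and abstract unification defined above translate almost directly into code. The following Python sketch is only an illustration under stated assumptions: sharing sets are modelled as frozensets of single-character variable names, and the helper names (`ss`, `rel`, `bin_union`, `star`, `amgu`) are ours rather than those of any existing analyzer.

```python
def ss(*names):
    """Build an abstraction from strings, e.g. ss("x", "yz") (hypothetical helper)."""
    return {frozenset(n) for n in names}

def rel(sh, variables):
    """sh[o]: the sharing sets of sh that contain some variable of o."""
    return {s for s in sh if s & set(variables)}

def bin_union(sh1, sh2):
    """Pairwise (binary) union of two abstractions."""
    return {s1 | s2 for s1 in sh1 for s2 in sh2}

def star(sh):
    """Closure under union: the least set containing sh and closed under union."""
    closed = set(sh)
    while True:
        new = {a | b for a in closed for b in closed} - closed
        if not new:
            return closed
        closed |= new

def amgu(sh, x, t_vars):
    """Abstract unification for the binding x -> t, where t_vars = vars(t)."""
    sx, st = rel(sh, {x}), rel(sh, t_vars)
    return (sh - (sx | st)) | bin_union(star(sx), star(st))

# x, y and z are initially independent; abstractly execute the binding x -> f(y, z).
result = amgu(ss("x", "y", "z"), "x", {"y", "z"})
print(sorted("".join(sorted(s)) for s in result))  # ['xy', 'xyz', 'xz']
```

The star closure is the expensive step here: it can generate a number of sets exponential in the size of its argument, which is why replacing it with the binary union matters for efficiency.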

The set-sharing lattice is thus given by the set

$$SS \stackrel{\mathrm{def}}{=} \{(sh, U) \mid sh \in SH, U \subseteq Var, \forall S \in sh : S \subseteq U\} \cup \{\bot, \top\}$$

which is a complete lattice ordered by $\leq_{SS}$ defined as follows. For elements
$\{d, (sh_1, U_1), (sh_2, U_2)\} \subseteq SS$:

$$\bot \leq_{SS} d$$
$$d \leq_{SS} \top$$
$$(sh_1, U_1) \leq_{SS} (sh_2, U_2) \text{ iff } U_1 = U_2 \text{ and } sh_1 \subseteq sh_2.$$

The lifting of $\cup$, $proj$, and $amgu$ defined over $SH$ to define the abstract
operations $\sqcup$, $Proj$, and $Amgu$ over $SS$ is straightforward.

Example 1. Let $Var = \{x, y, z\}$ be the set of variables of interest and consider
the substitutions $\theta_1 = \{x \mapsto f(u, u, v), y \mapsto g(u, v, w, o), z \mapsto h(u)\}$ and
$\theta_2 = \{x \mapsto u, y \mapsto u, z \mapsto 1\}$. Then, $sh_1 = \alpha(\theta_1) = \{xy, xyz, y\}$, where sharing set $xyz$
represents the occurrence of variable $u$ in $x$, $y$ and $z$, sharing set $xy$ represents the
occurrence of variable $v$ in $x$ and $y$, and sharing set $y$ represents the occurrence
of variables $w$ and $o$ in $y$. Similarly, we have that $sh_2 = \alpha(\theta_2) = \{xy\}$ where
sharing set $xy$ represents the occurrence of variable $u$ in $x$ and $y$. Let $U = Var$.
We then have that $(sh_2, U) \leq_{SS} (sh_1, U)$ and thus $(sh_1, U) \sqcup (sh_2, U) = (sh_1, U)$.
Finally, let $V = \{x, y\}$; then $Proj((sh_1, U), V) = (\{xy, y\}, V)$. Note that the sharing
set $xy$ in the projected abstraction represents not only the occurrence of variable
$u$ but also that of $v$. □
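The computations in Example 1 can be replayed mechanically. The sketch below is a minimal Python illustration, not part of the original development: terms are encoded as nested tuples `(functor, arg1, ...)`, variables as plain strings, and all function names are our own choices.

```python
def term_vars(t):
    """Variables of a term: strings are variables, tuples are (functor, args...)."""
    if isinstance(t, str):
        return {t}
    found = set()
    for arg in t[1:]:
        found |= term_vars(arg)
    return found

def occ(theta, v, var):
    """occ(theta, v): the variables of interest whose binding contains v."""
    return frozenset(x for x in var if v in term_vars(theta.get(x, x)))

def alpha(theta, var):
    """alpha(theta): one sharing set per variable in the image of theta."""
    img = set()
    for x in var:
        img |= term_vars(theta.get(x, x))
    return {occ(theta, v, var) for v in img}

def proj(sh, V):
    """proj(sh, V): restrict to V every sharing set that intersects V."""
    return {s & V for s in sh if s & V}

def show(sh):
    """Render sharing sets in the string notation used in the text."""
    return sorted("".join(sorted(s)) for s in sh)

VAR = frozenset("xyz")
theta1 = {"x": ("f", "u", "u", "v"),
          "y": ("g", "u", "v", "w", "o"),
          "z": ("h", "u")}
theta2 = {"x": "u", "y": "u", "z": ("1",)}  # the constant 1 as a 0-ary term

sh1 = alpha(theta1, VAR)
sh2 = alpha(theta2, VAR)
print(show(sh1))                         # ['xy', 'xyz', 'y']
print(show(sh2))                         # ['xy']
print(sh2 <= sh1)                        # True
print(show(proj(sh1, frozenset("xy"))))  # ['xy', 'y']
```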

3 Eliminating Redundancy from Sharing 

One of the main insights in [10,1] regarding the Sharing domain is the detection
of sets which are redundant (and can thus be safely eliminated or not produced)
as far as pair-sharing is concerned. Given an element $sh$ of $SH$, sharing set
$S \in sh$ is redundant w.r.t. pair sharing if and only if all its sharing pairs can
be extracted from other subsets of $S$ which also appear in $sh$. Formally, let
$pairs(S) \stackrel{\mathrm{def}}{=} \{xy \mid x, y \in S, x \neq y\}$. Then, $S$ is redundant iff

$$pairs(S) = \bigcup \{pairs(T) \mid T \in sh, T \subset S\}.$$



Example 2. Consider the abstraction $sh = \{xy, xz, yz, xyz\}$ defined over $Var = \{x, y, z\}$.
It is easy to see that set $xyz \in sh$ is redundant w.r.t. pair sharing. □

Based on this insight, a closure operator, $\rho : SH \to SH$, is defined in [1] to
add to each $sh \in SH$ the set of elements which are redundant for $sh$. Formally:

$$\rho(sh) = \{S \in SG \mid \forall x \in S : S \in sh[x]^*\}.$$
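Both the redundancy test and the closure operator can be checked on small abstractions by brute force. The Python sketch below is an illustration only: the function names are ours, and the guard that excludes sets with fewer than three variables is our assumption (the formula on its own would trivially accept singletons, whose set of pairs is empty).

```python
from itertools import combinations

def ss(*names):
    """Build an abstraction from strings such as "xy" (hypothetical helper)."""
    return {frozenset(n) for n in names}

def pairs(s):
    """All sharing pairs contained in sharing set s."""
    return {frozenset(p) for p in combinations(s, 2)}

def is_redundant(s, sh):
    """True if pairs(s) is covered by the proper subsets of s occurring in sh."""
    if len(s) < 3:  # assumption: only sets with three or more variables qualify
        return False
    covered = set()
    for t in sh:
        if t < s:  # proper subset
            covered |= pairs(t)
    return pairs(s) == covered

def star(sh):
    """Closure of sh under union."""
    closed = set(sh)
    while True:
        new = {a | b for a in closed for b in closed} - closed
        if not new:
            return closed
        closed |= new

def rho(sh, var):
    """rho(sh): every S such that S belongs to sh[x]* for each x in S."""
    result = set()
    for k in range(1, len(var) + 1):
        for combo in combinations(sorted(var), k):
            s = frozenset(combo)
            if all(s in star({t for t in sh if x in t}) for x in s):
                result.add(s)
    return result

sh = ss("xy", "xz", "yz", "xyz")
print(is_redundant(frozenset("xyz"), sh))        # True (Example 2)
print(rho(ss("xy", "xz", "yz"), "xyz") == sh)    # True: rho adds xyz
```

Enumerating all subsets of the variables of interest is exponential and is done here only because the examples use three variables.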

This function is then used to define a new domain $SS^\rho$ which is the quotient
of $SS$ w.r.t. the new equivalence relation induced by $\rho$: elements $d_1$ and $d_2$ are
equivalent iff $\rho(d_1) = \rho(d_2)$. The authors prove that (a) the addition of redundant
elements does not cause any precision loss as far as pair-sharing is concerned,
i.e., that $SS^\rho$ is as good as $SS$ at representing pair-sharing, and that (b) $\rho$
is a congruence w.r.t. the abstract operations $Amgu$, $\sqcup$ and $Proj$. Thus, they
conclude that $SS^\rho$ is as good as $SS$ also for propagating pair-sharing through
the analysis process.

The above insight is used by [1] to perform two major changes to the Sharing 
domain. Firstly, redundant elements can be eliminated (although experimental 
results suggest that this is not always advantageous). And secondly, addition of 
redundant elements can be avoided by replacing the star union with the binary 
union operation without loss of accuracy. This is a very important change since it 
can have significant impact on efficiency by simplifying one of the most expensive 
abstract operations in Sharing. 

The results obtained in [1] are indeed interesting and can be very useful in
some contexts. However, there are situations in which the lack of redundant sets
can lead to loss of accuracy w.r.t. pair sharing, and even incorrect results if the
full expressive power of Sharing is assumed to be still present in $SS^\rho$.

Example 3. Consider the abstractions $sh_1 = \{x, y, z, xy, xz, yz\}$ and
$sh_2 = \{x, y, z, xy, xz, yz, xyz\}$ defined over $Var = \{x, y, z\}$, and note that
$\rho(sh_1) = sh_2$, i.e., the sharing set $xyz$ is redundant for $sh_2$.

Consider the Prolog builtin x == y which succeeds if program variables x and
y are bound at run-time to identical terms. A sophisticated implementation of
the Sharing domain (such as that of [4]) could take advantage of this information
and eliminate every single sharing set in which the program variables x and y
appear but not together (since all variables which occur in x must also occur in
y, and vice versa). Thus, correct and precise abstractions of a situation in which
the builtin was successfully executed in stores represented by $sh_1$ and $sh_2$ will
become $sh_1' = \{z, xy\}$ and $sh_2' = \{z, xy, xyz\}$, respectively. However, it is easy
to see that $pairs(sh_1') \neq pairs(sh_2')$, since $z$ is definitely independent of both $x$
and $y$ in $sh_1'$ while it might still share with them in $sh_2'$. □
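The effect described in Example 3 is easy to reproduce. The following Python sketch is a minimal illustration of the refinement step described in the example, not the implementation of [4]; the names `refine_identical` and `pair_sharing` are hypothetical.

```python
from itertools import combinations

def ss(*names):
    """Build an abstraction from strings such as "xy" (hypothetical helper)."""
    return {frozenset(n) for n in names}

def pair_sharing(sh):
    """All pairs of variables that may share according to sh."""
    return {frozenset(p) for s in sh for p in combinations(s, 2)}

def refine_identical(sh, x, y):
    """After x == y succeeds, keep only the sharing sets in which
    x and y occur together or not at all."""
    return {s for s in sh if (x in s) == (y in s)}

sh1 = ss("x", "y", "z", "xy", "xz", "yz")
sh2 = sh1 | ss("xyz")

# Before the builtin, both abstractions describe the same pair-sharing.
print(pair_sharing(sh1) == pair_sharing(sh2))    # True

sh1_after = refine_identical(sh1, "x", "y")
sh2_after = refine_identical(sh2, "x", "y")
print(sh1_after == ss("z", "xy"))                # True
print(sh2_after == ss("z", "xy", "xyz"))         # True

# After the builtin, only sh2 still allows z to share with x and y.
print(pair_sharing(sh1_after) == pair_sharing(sh2_after))  # False
```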

The above example shows that Sharing can make use of the information 
provided by other sources in order to improve the pair-sharing accuracy of its 
elements, while the same action might lead to incorrect results for elements of
$SS^\rho$ if redundant sharing sets had actually been eliminated from those elements.
As we will see in the following sections, this can happen when using information
coming not only from builtins, but also from other domains (such as freeness)
which are usually combined with set-sharing. Furthermore, useful information
other than sharing can be inferred from combinations of Sharing and other
sources which are not possible with $SS^\rho$.

4 When Redundant Sets Are No Longer Redundant 

The problem illustrated in the previous example is rooted in the always surprising
complexity of the information encoded by elements of $SH$. As indicated by
[1,2], elements of $SH$ can encode definite groundness (e.g., $x$ is ground),
groundness dependencies (e.g., if $x$ becomes ground then $y$ is ground), and sharing
dependencies.* However, as we will see in this section, these are only by-products
of the main property represented by elements of $SH$: the different variable
occurrences shared by each set of program variables.

The groundness of variable $x$, and the sharing independence between variables
$x$ and $y$ (i.e., the fact that $x$ and $y$ are known not to share) can be expressed
by an element $sh \in SH$ as follows:

$$ground(x) \text{ iff } \forall S \in sh : x \notin S$$
$$indep(x, y) \text{ iff } \forall S \in sh : xy \not\subseteq S$$

where $ground(x)$ represents the fact that variable $x$ is ground in all substitutions
abstracted by $sh$, and $indep(x, y)$ represents the fact that variables $x$ and $y$ do
not share in any substitution abstracted by $sh \in SH$.

Groundness dependencies in $sh \in SH$ can be easily obtained from the above
statements in the following way. Let us assume that $x$ is known to be ground.
We can then modify $sh$ by enforcing $\forall S \in sh : x \notin S$ to hold, i.e., by
eliminating every $S \in sh$ such that $x \in S$. If we can then prove that the same
statement holds for some other variable $y$, we would then know that the
implication $ground(x) \to ground(y)$ holds for $sh$. This simply illustrates the
well-known result that Sharing subsumes the groundness dependency domain Def.
The same method can be used for obtaining other dependencies for elements $sh$
of $SH$. The following were used in [5] for simplifying parallelization tests:

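1. $ground(x_1) \wedge \ldots \wedge ground(x_n) \to ground(y)$ if

   $\forall S \in sh$ : if $y \in S$ then $\{x_1, \ldots, x_n\} \cap S \neq \emptyset$

2. $ground(x_1) \wedge \ldots \wedge ground(x_n) \to indep(y, z)$ if

   $\forall S \in sh$ : if $\{y, z\} \subseteq S$ then $\{x_1, \ldots, x_n\} \cap S \neq \emptyset$

3. $indep(x_1, y_1) \wedge \ldots \wedge indep(x_n, y_n) \to ground(z)$ if

   $\forall S \in sh$ : if $z \in S$ then $\exists j \in [1, n], \{x_j, y_j\} \subseteq S$

4. $indep(x_1, y_1) \wedge \ldots \wedge indep(x_n, y_n) \to indep(w, z)$ if

   $\forall S \in sh$ : if $\{w, z\} \subseteq S$ then $\exists j \in [1, n], \{x_j, y_j\} \subseteq S$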

Let us now characterize in a similar way the (non-symmetrical) property
$covers(x, y)$ expressed by an element $sh \in SH$ as follows:

$$covers(x, y) \text{ iff } \forall S \in sh : \text{if } y \in S \text{ then } x \in S$$

where $covers(x, y)$ indicates that variable $y$ shares all its variables with variable
$x$ and, therefore, every sharing set in which $y$ appears must also contain $x$. We
can now derive other sharing dependencies for any $sh \in SH$, such as:

5. $covers(x_1, y_1) \wedge \ldots \wedge covers(x_n, y_n) \to ground(z)$ if

   $\forall S \in sh$ : if $z \in S$ then $\exists j \in [1, n], y_j \in S, x_j \notin S$

6. $covers(x_1, y_1) \wedge \ldots \wedge covers(x_n, y_n) \to indep(w, z)$ if

   $\forall S \in sh$ : if $\{w, z\} \subseteq S$ then $\exists j \in [1, n], y_j \in S, x_j \notin S$

7. $covers(x_1, y_1) \wedge \ldots \wedge covers(x_n, y_n) \to covers(w, z)$ if

   $\forall S \in sh$ : if $z \in S, w \notin S$ then $\exists j \in [1, n], y_j \in S, x_j \notin S$

* The fact that it also encodes independence (e.g., $x$ does not share with $y$) was
probably left unmentioned because this is also encoded by pair-sharing.

It is important to note that while the expressions with only $ground(x)$ and
$indep(x, y)$ elements can also hold for any element of $SS^\rho$, this is not true for
the expressions with coverage information.

Example 4. Consider again the abstractions introduced by Example 3, $sh_1 =
\{x, y, z, xy, xz, yz\}$ and $sh_2 = \{x, y, z, xy, xz, yz, xyz\}$, which are defined over
$Var = \{x, y, z\}$. Let us assume that both abstractions belong to Sharing. While
implication $covers(x, y) \wedge covers(y, x) \to indep(x, z)$ holds for $sh_1$, it does not
hold for $sh_2$. If we now consider the $SS^\rho$ domain, both abstractions would be
represented by the element $sh_1$. Therefore, the implication should not hold for
$sh_1$ in $SS^\rho$. □
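The implication discussed in Example 4 can be checked by enforcing each coverage assumption (dropping the sharing sets that violate it) and then testing independence on what remains. The Python sketch below is an illustration with hypothetical helper names; it follows the elimination method described above for groundness dependencies.

```python
def ss(*names):
    """Build an abstraction from strings such as "xy" (hypothetical helper)."""
    return {frozenset(n) for n in names}

def indep(sh, x, y):
    """indep(x, y): no sharing set contains both x and y."""
    return not any(x in s and y in s for s in sh)

def assume_covers(sh, x, y):
    """Enforce covers(x, y): drop every sharing set containing y but not x."""
    return {s for s in sh if x in s or y not in s}

def covers_imply_indep(sh, cover_pairs, w, z):
    """Does covers(x1, y1) and ... and covers(xn, yn) imply indep(w, z) in sh?"""
    refined = sh
    for x, y in cover_pairs:
        refined = assume_covers(refined, x, y)
    return indep(refined, w, z)

sh1 = ss("x", "y", "z", "xy", "xz", "yz")
sh2 = sh1 | ss("xyz")
assumptions = [("x", "y"), ("y", "x")]

print(covers_imply_indep(sh1, assumptions, "x", "z"))  # True
print(covers_imply_indep(sh2, assumptions, "x", "z"))  # False
```

The redundant set xyz survives both coverage assumptions, which is exactly why the implication fails for the second abstraction.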

In order to understand why, consider the differences between the expressions
$ground(x)$ iff $\forall S \in sh : x \notin S$, and $indep(x, y)$ iff $\forall S \in sh : xy \not\subseteq S$, and the
expression $covers(x, y)$ iff $\forall S \in sh$ : if $y \in S$ then $x \in S$. While in the first two
the sharing sets which violate the right hand side of the expressions would always
include the redundant set (if any), those which violate the last expression would
not. Thus, assuming coverage might result in a subset of a redundant set
being eliminated without the redundant set itself being eliminated. In this way
sharing sets which are considered redundant at some point might become
non-redundant once coverage information is added and, therefore, their elimination
(or non-generation) can lead to incorrect information. For example, consider the
abstraction $sh = \{xyz, xy, xz, yz\}$. While the problematic sets for $ground(x)$
and $indep(x, y)$ in $sh$ are $xyz, xy, xz$ and $xyz, xy$, respectively, the only one for
$covers(x, y)$ is $yz$. But once $yz$ is removed from $sh$, $xyz$ is no longer redundant:
it is the only sharing set able (when $x$ covers $y$) to represent the possible sharing
between $y$ and $z$.

As a result, sharing sets initially redundant for pair-sharing can prove use- 
ful whenever combined with other sources of information (coming from builtins, 
other analysis domains, etc.) capable of distinguishing between the variable oc- 
currences represented by the redundant sharing sets and the variable occurrences 
represented by their subsets, so that, once the extra information is added, a shar- 
ing set previously identified as redundant will no longer be so. 

5 Combining Sharing with Freeness

In this section we will use the popular combination of Sharing with freeness 
information to illustrate two points. First, that very common sources of infor- 
mation (such as freeness) can distinguish between variable occurrences, an ability 
which can be exploited in ways that can make a redundant set no longer redun- 
dant. Thus, it can be advantageous not to eliminate them. And second, that 
the goal of sharing analysis for logic programs is not only to detect which pairs 
of variables are definitely independent, but also to detect (or propagate) many 
other kinds of information. 
In order to illustrate these points we will use the notion of active sharing sets
[6]. A sharing set $S \in sh$ is said to be active for store $c \in \gamma(sh)$ iff $S \in \alpha(c)$. All
sharing sets $\{S_1, \ldots, S_n\} \subseteq sh$ are said to be active at the same time if there
exists a store $c \in \gamma(sh)$ such that $\forall i, 1 \leq i \leq n$, $S_i \in \alpha(c)$. If only the information
in Sharing is taken into account, then all sharing sets in any $sh \in SH$ can be
active at the same time.

Example 5. Consider the set-sharing abstraction $sh = \{x, xy, yz\}$ defined over
$Var = \{x, y, z\}$. All sets in $sh$ can be active at the same time since there exists
a store, say $\theta = \{x = f(u, v), y = f(v, w), z = \ldots\}$, such that $\alpha(\theta) = sh$. In
particular, $u$ is the variable represented by sharing set $x$, $v$ is represented by $xy$,
and $w$ is represented by $yz$. □

However, this is not always the case when considering information outside
the scope of Sharing. In some cases, two or more sharing sets cannot be active
at the same time since, thanks to some extra information, we can determine that
these sharing sets must represent the same variable occurrence(s).

Example 6. Consider again the set-sharing abstraction $sh = \{x, xy, yz\}$ defined
over $Var = \{x, y, z\}$, and let us now assume $y$ and $z$ are known to be free
variables. As pointed out in [6], since each sharing set in an abstraction represents
a different occurrence of one or more variables, no two sharing sets containing
the same free variable can be active at the same time (the same variable cannot
be a different occurrence). In our example, $xy$ and $yz$ cannot be active at the
same time since there is no concrete store with both $y$ and $z$ free, such that both
share a variable not shared with anyone else (sharing set $yz$) and $y$ also shares
a different variable with $x$ (sharing set $xy$). □

Knowing which sharing sets in abstraction $sh$ can be active at the same
time according to some extra information $\Omega$ is useful because we can use this notion to divide $sh$ into
$\{sh_1, \ldots, sh_n\}$ such that $sh = sh_1 \cup \ldots \cup sh_n$, $\forall i, 1 \leq i \leq n$, all sets in $sh_i$ can
be active at the same time, and $\nexists j, 1 \leq j \leq n : j \neq i, sh_j \subseteq sh_i$.

Example 7. Consider again the abstraction $sh = \{x, xy, yz\}$ defined over $Var =
\{x, y, z\}$. If $y$ and $z$ are known to be free variables, $sh$ can be divided into two
different sets, $\{x, xy\}$ and $\{x, yz\}$, whose sharing sets can all be active at the
same time. The former represents the concrete stores in which $x$ definitely shares
a variable with $y$ (which is actually known to be $y$ itself), and $x$ might also have
some variable which is not shared with anyone else. The latter represents the
stores in which the free variables $y$ and $z$ are aliased and $x$ might have some
variables which are not shared with anyone else. □

Note that the different $sh_i$ together with $\Omega$ describe disjoint sets of concrete
stores. Furthermore, even though $(\bigcup_i \gamma(sh_i)) \cap \gamma(\Omega)$ is still equivalent to $\gamma(sh) \cap \gamma(\Omega)$
(which justifies the correctness of dividing $sh$ into the different $sh_i$ in the
presence of $\Omega$), it is often the case that $\bigcup_i \gamma(sh_i) \subset \gamma(sh)$, as happens in the
above example. As a result, it is generally easier to understand the concretization
of $sh$ and $\Omega$ by means of the concretization of each $sh_i$ and $\Omega$. Let us use this to
show how the direct-product domain [11] of Sharing and freeness can be used
to improve pair-sharing.

Example 8. Consider the abstraction $sh = \{xy, xz, yz, xyz\}$ defined over program
variables $x$, $y$ and $z$. If we knew that $x$, $y$, and $z$ are free we could divide
$sh$ into the sets $sh_1 = \{xy\}$, $sh_2 = \{xz\}$, $sh_3 = \{yz\}$ and $sh_4 = \{xyz\}$. Now, $sh_1$
represents stores in which $z$ is known to be ground, which is not true according
to our freeness information. Thus, its sharing sets ($xy$) can be eliminated from
$sh$. The same reasoning applies to $sh_2$ and $sh_3$. Thus, $sh$ can be simplified to
$\{xyz\}$, indicating that all variables definitely share (which of course also implies
their definite pair-sharing dependencies). Note that if the set $xyz$ did not belong
to the abstraction, the concretization of $sh$ in the context of freeness would be
empty (indicating a failure in the program). □
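The division used in Examples 7 and 8 can be sketched as follows. This Python fragment is an illustration under two assumptions taken from the text: two sharing sets cannot be active together when they contain the same free variable, and a group in which some free variable never occurs is discarded because that variable would be ground. The function names are ours, and the brute-force enumeration is only suitable for tiny abstractions.

```python
from itertools import combinations

def ss(*names):
    """Build an abstraction from strings such as "xy" (hypothetical helper)."""
    return {frozenset(n) for n in names}

def compatible(s1, s2, free):
    """Two sharing sets may be active together unless they share a free variable."""
    return not (s1 & s2 & free)

def divide(sh, free):
    """Maximal subsets of sh whose sharing sets are pairwise compatible."""
    sets = sorted(sh, key=sorted)
    groups = []
    for k in range(len(sets), 0, -1):  # largest candidates first
        for cand in combinations(sets, k):
            group = frozenset(cand)
            if (all(compatible(a, b, free) for a, b in combinations(cand, 2))
                    and not any(group < g for g in groups)):
                groups.append(group)
    return groups

def consistent(group, free):
    """A free variable is not ground, so it must occur in some sharing set."""
    return free <= frozenset().union(*group)

def show(group):
    return sorted("".join(sorted(s)) for s in group)

# Example 7: y and z free.
ex7 = [show(g) for g in divide(ss("x", "xy", "yz"), frozenset("yz"))]
print(ex7)  # [['x', 'xy'], ['x', 'yz']]

# Example 8: x, y and z free; only the group {xyz} mentions all of them.
free8 = frozenset("xyz")
groups8 = divide(ss("xy", "xz", "yz", "xyz"), free8)
ex8 = [show(g) for g in groups8 if consistent(g, free8)]
print(len(groups8), ex8)  # 4 [['xyz']]
```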

The above example shows how the direct-product domain of $SS^\rho$ and freeness
might be incorrect if the full power of set-sharing is assumed to be still present in
$SS^\rho$. This occurs whenever a redundant set is known to contain a free variable,
since it would then appear in an $sh_i$ without one or more of its subsets. Thus,
the set would no longer be redundant for $sh_i$. A simple solution would be to
behave as if redundant sets containing free variables were present in the $SS^\rho$
abstractions even if they do not appear explicitly in them. It would be easy
to think that such a solution does not lose accuracy w.r.t. pair sharing. This is,
however, not true.

Example 9. Consider the set-sharing abstraction $sh = \{xy, xz, yz\}$ defined over
$Var = \{x, y, z\}$. If we knew that $y$ and $z$ were free, we could divide $sh$ into
the sets $sh_1 = \{xy, xz\}$ and $sh_2 = \{yz\}$, respectively representing the concrete
stores in which $x$ shares with $y$ and $z$, which do not share among them, and those
in which $x$ does not share with anyone and $y$ shares with $z$. Note that these two
situations are mutually exclusive. This allows us to prove (among others) that:

$$indep(y, z) \text{ iff } \neg indep(x, y) \quad \text{and} \quad indep(y, z) \text{ iff } \neg indep(x, z).$$

This is crucial pair-sharing information (e.g., for automatic AND-parallelization,
as we will see in the next section). If the redundant set $xyz$ could have been
eliminated from $sh$, the above expression might not hold, since the variables
might then be aliased to the same free variable, thus capturing also the case in
which all of them are definitely dependent on each other. □

Let us now show how combining Sharing and freeness information, as done 
for example in Sharing+Freeness [25], yields interesting kinds of information 
other than the sharing itself, information which is the goal of such analyses for 
several applications. 

Example 10. Consider again the set-sharing abstraction $sh = \{xy, xz, yz\}$ defined
over $Var = \{x, y, z\}$. As mentioned above, if we knew that $y$ and $z$ were
free, we could divide $sh$ into the sets $sh_1 = \{xy, xz\}$ and $sh_2 = \{yz\}$. The
concrete stores represented by these sets can in fact be described much more
accurately than we did in the previous example: while $sh_1$ represents stores in
which $x$ is bound to a term with two (and only two) non-aliased free variables
($y$ and $z$), $sh_2$ represents those stores in which $x$ is ground, and $y$ and $z$ are free
aliased variables. As a result, we can be sure $sh$ only represents stores in which
$x$ is bound to a non-variable term. □

Definite information about non-variable bindings is used, for example, to
determine whether dynamically scheduled goals waiting for a program variable to
become non-variable can be woken up, as performed by [15]. However, such
information cannot be obtained if redundant sets containing free variables are
eliminated.

Example 11. Consider the set-sharing abstractions $sh = \{xy, xz, yz\}$ above and
$sh' = sh \cup \{xyz\}$, where $y$ and $z$ are known to be free. We could divide $sh'$ into the
sets $sh_1 = \{xy, xz\}$, $sh_2 = \{yz\}$ and $sh_3 = \{xyz\}$. The first two are as above,
while the third represents stores in which all of $x$, $y$ and $z$ share the same variables
(with $x$ possibly being a free variable). Thus, $sh'$ does not only represent stores
in which $x$ is bound to a non-variable term. □

Definite knowledge about non-variable bindings is not the only kind of useful
information that can be inferred from combining Sharing and freeness. The
combination can also be used to detect new bindings added by some body literal.

Example 12. Consider again the set-sharing abstraction $sh = \{xy, xz, yz\}$ where
$y$ and $z$ are known to be free. Let us assume that $sh$ is the abstract call for body
literal p(x, y, z) (i.e., the abstraction at the program point right before executing
the literal) and that $sh' = \{xy, xz, yz, xyz\}$ is the abstract answer for p(x, y, z)
(i.e., the abstraction at the program point right after executing the literal) with
$y$ and $z$ still known to be free. The addition of sharing set $xyz$ means that a new
binding aliasing $y$ and $z$ might have been introduced by p(x, y, z). However, if
the abstract answer is found to be identical to the call $sh$, we can be sure that
none of the three program variables has been further instantiated (since they
are still known to be free) nor any new aliasing introduced among them. □

The above kind of information is used, for example, for detecting non-strict 
independence [6] as we will see in the next section. As shown in the above 
example, this information cannot be inferred if redundant sets might have been 
eliminated (or not produced). 



6 When Independence among Sets Is Relevant 

This section uses the well-known application of automatic parallelization within
the independent AND-parallelism model [9] to illustrate how some applications
(a) require independence among sets (as opposed to pairs) of variables, and (b)
can benefit from combining Sharing with freeness information in ways which
would not be possible with $SS^\rho$. The relevance of this application comes from the
fact that it is not only one of the best known applications of sharing information,
but also the one for which the Sharing domain was developed.

In the independent AND-parallelism model goals $g_1$ and $g_2$ in the sequence
$g_1, g_2$ can be run in parallel in constraint store $c$ if $g_2$ is independent of $g_1$ for
store $c$. In this context, independence refers to the conditions that the run-time
behavior of these goals must satisfy in order to guarantee the correctness and
efficiency of their parallelization w.r.t. their sequential execution. This can be
expressed as follows: goal $g_2$ is independent of goal $g_1$ for store $c$ iff the execution
of $g_2$ in $c$ has the same number of computation steps, cost, and answers as that
of $g_2$ in any store $c'$ obtained from executing $g_1$ in $c$.

Note that the general independence condition introduced above is thus nei- 
ther symmetric nor established between pairs of variables, as assumed by [1,2]. 
However, this general notion of independence is indeed rarely used. Instead, suf- 
ficient (and thus simpler) conditions are generally used to ensure independence. 
These conditions can be divided into two main groups: a priori and a posteriori. 
A priori conditions can always be checked prior to the execution of the goals 
involved, while a posteriori conditions can be based on the actual behaviour of 
the goals to be run in parallel. 

A priori conditions are more popular even though they can be less accurate. 
The reasons are twofold. First, they can only be based on the characteristics 
of the store c and the variables belonging to the goals to be run in parallel. 
Thus, they are relatively simple. And second, they can be used as run-time 
tests without actually running the goals themselves. This is useful whenever 
the conditions cannot be proved correct at compile-time. Note that a priori 
conditions must be symmetric: goals $g_1$ and $g_2$ are independent for $c$ iff $g_1$ is
independent of $g_2$ for $c$ and $g_2$ is independent of $g_1$ for $c$.

The most general a priori condition, called projection independence, was defined
in [14] as follows: goals $g_1$ and $g_2$ are independent for $c$ if for any variable
$x \in vars(g_1) \cap vars(g_2)$, $x$ is uniquely defined by $c$ (i.e., ground), and the
constraint obtained by conjoining the projection of $c$ over $vars(g_1)$ and the
projection of $c$ over $vars(g_2)$ entails (i.e., logically implies) the constraint obtained by
projecting $c$ over $vars(g_1) \cup vars(g_2)$.

Example 13. Consider the literals p(x), q(y), r(z) and constraint $c = \{x = y + z\}$.
The projection of $c$ over the sets of variables containing either one or two
variables from $\{x, y, z\}$ is the empty constraint true. Thus, we can ensure that
every pair of literals, say p(x) and q(y), can run in parallel. However, no literal
can run in parallel with the goal formed by the conjunction of the other two
literals, e.g., p(x) cannot run in parallel with goal q(y), r(z), since the projection
of $c$ over $\{x, y, z\}$ is $c$ itself, which is indeed not entailed by true. □

Therefore, as mentioned in both [24] and [13], in general projection independence
does indeed rely on the independence of a pair of sets of variables. However,
for the Herbrand case projection independence is equivalent to the better known
a priori condition called strict independence, which was introduced in [9,12] and
formally defined and proved correct in [16]. It states that goals $g_1$ and $g_2$ are
strictly independent for substitution $\theta$ iff $vars(g_1)$ do not share with $vars(g_2)$
for $\theta$, i.e., iff $vars(g_1\theta) \cap vars(g_2\theta) = \emptyset$. It is easy to prove that this is equivalent
to requiring that for every pair of variables $xy$, $x \in vars(g_1)$, $y \in vars(g_2)$, $x$ and
$y$ do not share.
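The equivalence between the set-based and the pairwise formulation of strict independence can be observed on concrete substitutions. The Python sketch below is only an illustration: terms are nested tuples `(functor, arg1, ...)`, variables are strings, and the function names and the sample substitution are our own.

```python
def term_vars(t):
    """Variables of a term: strings are variables, tuples are (functor, args...)."""
    if isinstance(t, str):
        return {t}
    found = set()
    for arg in t[1:]:
        found |= term_vars(arg)
    return found

def goal_vars(goal_variables, theta):
    """vars(g theta): the variables left in a goal after applying theta."""
    found = set()
    for x in goal_variables:
        found |= term_vars(theta.get(x, x))
    return found

def strictly_independent(g1_vars, g2_vars, theta):
    """Set formulation: vars(g1 theta) and vars(g2 theta) are disjoint."""
    return not (goal_vars(g1_vars, theta) & goal_vars(g2_vars, theta))

def pairwise_independent(g1_vars, g2_vars, theta):
    """Pair formulation: no x in vars(g1) shares with any y in vars(g2)."""
    return all(not (term_vars(theta.get(x, x)) & term_vars(theta.get(y, y)))
               for x in g1_vars for y in g2_vars)

# Hypothetical store: x and y share the run-time variable u; z is unrelated.
theta = {"x": ("f", "u"), "y": ("g", "u", "w"), "z": "v"}

print(strictly_independent({"x"}, {"y"}, theta))       # False: p(x) and q(y) share u
print(strictly_independent({"x"}, {"z"}, theta))       # True
print(pairwise_independent({"x"}, {"y", "z"}, theta))  # False, same as the set test
```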
Therefore, only for a priori conditions and the Herbrand domain is parallelization
based on the independence of pairs of variables. And even in this case,
the Sharing domain is more powerful than $SS^\rho$ when combined with other kinds
of information.

Example 14. Consider again the abstractions $sh = \{xy, xz, yz, xyz\}$ and $sh' =
\{xy, xz, yz\}$ defined over $Var = \{x, y, z\}$. Example 9 illustrated how the formulas
$indep(y, z)$ iff $\neg indep(x, y)$ and $indep(y, z)$ iff $\neg indep(x, z)$
hold for $sh'$ but not for $sh$ when $y$ and $z$ are known to be free.

Consider the automatic parallelization of sequential goal p(y), q(z), r(x)
for the usual case of the a priori condition strict independence and the Herbrand
domain. In the absence of any information regarding the state of the store
occurring right before the sequential goal is executed, the compiler could rewrite
the sequential goal into the following parallel goal (leftmost column):



    ( indep(y,z) ->                ( indep(y,z) ->             ( indep(y,z) ->
        ( indep(x,y) ->                (p(y)&q(z)),r(x)            (p(y)&q(z)),r(x)
            ( indep(x,z) ->        ;   p(y),(q(z)&r(x))        ; indep(x,z) ->
                p(y)&q(z)&r(x)     )                               p(y),(q(z)&r(x))
            ;   p(y)&(q(z),r(x))                               ;   p(y),q(z),r(x)
            )                                                  )
        ;   (p(y)&q(z)),r(x)
        )
    ; indep(x,z) ->
        p(y),(q(z)&r(x))
    ;   p(y),q(z),r(x)
    )



where the operator & represents parallel execution of two goals, and the run-time
test indep(x,y) succeeds if the two variables do not share at run-time. The
middle and right columns represent the simplifications that can be performed on
the parallel goal in the context of $sh'$ and $sh$, respectively. This is because while
test indep(x,y) is known to fail if indep(y,z) succeeds for both $sh$ and $sh'$,
test indep(x,z) is known to succeed if indep(y,z) fails for $sh'$ but not for $sh$.
Thus, for $sh$, indep(x,z) still needs to be tested at run-time with the resulting loss of
efficiency. □
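The two facts used in Example 14 to simplify the run-time tests can be checked with a small Python sketch. It rests on assumptions of ours that go slightly beyond the text: each abstraction is divided into maximal groups of sharing sets with no common free variable, and within a group the test indep(v, w) is taken to fail for certain when one of the two variables is free and every sharing set containing it also contains the other. All function names are hypothetical.

```python
from itertools import combinations

def ss(*names):
    """Build an abstraction from strings such as "xy" (hypothetical helper)."""
    return {frozenset(n) for n in names}

def divide(sh, free):
    """Maximal subsets of sh in which no two sets share a free variable."""
    sets = sorted(sh, key=sorted)
    groups = []
    for k in range(len(sets), 0, -1):
        for cand in combinations(sets, k):
            group = frozenset(cand)
            if (all(not (a & b & free) for a, b in combinations(cand, 2))
                    and not any(group < g for g in groups)):
                groups.append(group)
    return groups

def outcome(group, v, w, free):
    """Result of the run-time test indep(v, w) for the stores of one group:
    True (succeeds), False (fails) or None (not determined)."""
    if not any(v in s and w in s for s in group):
        return True
    for a, b in ((v, w), (w, v)):
        # A free variable occurs in some active set; if all of its sets
        # also contain the other variable, the two definitely share.
        if a in free and all(b in s for s in group if a in s):
            return False
    return None

free = frozenset("yz")
results = {}
for name, sh in (("sh'", ss("xy", "xz", "yz")), ("sh", ss("xy", "xz", "yz", "xyz"))):
    groups = divide(sh, free)
    # indep(x,y) fails in every group where indep(y,z) may succeed
    xy_fails = all(outcome(g, "x", "y", free) is False
                   for g in groups if outcome(g, "y", "z", free) is not False)
    # indep(x,z) succeeds in every group where indep(y,z) may fail
    xz_succeeds = all(outcome(g, "x", "z", free) is True
                      for g in groups if outcome(g, "y", "z", free) is not True)
    results[name] = (xy_fails, xz_succeeds)

print(results)  # {"sh'": (True, True), 'sh': (True, False)}
```

The second component is what separates the two abstractions: with the set xyz present, a failing indep(y,z) no longer guarantees that indep(x,z) succeeds.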

The assumption is also incorrect when considering a posteriori conditions,
even those associated with the Herbrand domain. In particular, strict independence
has been generalised to several different a posteriori notions of non-strict
independence [16]. These notions allow goals that share variables to run in parallel
as long as the bindings established for those shared variables satisfy certain
conditions. For example, one of the simpler notions only allows $g_1$ to instantiate
a shared variable and does not allow any aliasing (of different shared variables) to
be created during the execution of $g_1$ that might affect goals to the right. Thus,
for this notion, the conditions are established between the bindings introduced
by the two goals over their respective sets of variables, and cannot be expressed
using only sharing between pairs of variables.
There has been at least one attempt [6] at inferring non-strict independence
at compile-time using the abstract domain Sharing+Freeness. The inference
is based on two conditions. The first ensures that (C1) no shared variables are
further instantiated by $g_1$. This is done by requiring that (a) all shared variables
share through variables known to be free in the abstract call of $g_1$ (all sharing
sets in the abstract call containing shared variables also contain a free variable),
and (b) all these variables must remain free in the abstract answer of $g_1$ (all
such sharing sets still contain a free variable after the analysis of $g_1$). This first
condition can be detected in the $SS^\rho$ domain since the existence of a free variable
in every sharing pair ensures the existence of a free variable in the “redundant”
sharing set. Thus, the absence of such a sharing set is not a problem.

This is not however the case for the second condition, which ensures that no 
aliasing is introduced among shared variables by requiring C1 and, additionally, 
that (C2) there is no introduction in the abstract answer of any sharing set 
resulting from the union of several sets such that none contain the same free 
variable, and at least two contain variables belonging to both goals. 

Example 15. Consider again the set-sharing abstraction sh = {xy, xz, yz} where 
y and z are known to be free. Let us assume that sh is the abstract call for body 
p(x,y,z), q(x,y,z) and that sh' = {xy, xz, yz, xyz} is the abstract answer for 
p(x,y,z) with y and z still known to be free. All sharing sets in sh containing 
variables from both literals contain a free variable which remains free in sh'. 
Thus, C1 is satisfied. However, there exists a set xyz in sh' which can be obtained 
by unioning at least two sets xy and xz in sh which contain variables from 
both literals and have no variable in common known to be free in sh. The 
appearance of such a set represents the possible aliasing of y and z by p(x,y,z). 
This appearance violates C2 and thus the goals cannot run in parallel. Note 
that if the abstract answer was found to be identical to sh (i.e., if the redundant 
set xyz was absent), we would have been able to ensure that none of the three 
program variables had been further instantiated nor any new aliasing introduced 
among them. Therefore, we could have ensured that $g_2$ is independent of $g_1$ 
for the stores represented by sh and the associated freeness information, thus 
allowing their parallel execution. □

The above example illustrates the fact that an equivalent inference cannot be 
performed in the $SS^\rho$ domain augmented with freeness unless care is taken when 
considering redundant sharing sets which include program variables known to be 
free. This is because the inference strongly depends on distinguishing between 
the different bindings introduced during execution of the goals to be run in par- 
allel, and as a result, on distinguishing between the different shared variables 
represented by the abstractions in the domain. Thus, elimination of redundant 
sets can render the method incorrect. One possible solution is to always assume 
that redundant sets containing free variables are present when combining $SS^\rho$ 
with freeness information. However, as shown in Example 9, this might be im- 
precise. Another, more accurate solution, is to only eliminate redundant sets 
which do not contain variables known to be free. 
7 Conclusion 

We have shown that the power of set-sharing does not come from representing 
sets of variables that share, but from representing different variable occurrences. 
As a result, eliminating from Sharing information which is considered “redundant” 
w.r.t. the pair-sharing property as performed in $SS^\rho$ can have unexpected 
consequences. In particular, when Sharing is combined with some other kinds of 
information capable of distinguishing among variable occurrences in a way that 
can make a redundant set no longer redundant, it can yield results not obtainable 
with $SS^\rho$, including better pair-sharing. Furthermore, there exist applications 
which use Sharing analysis (combined with freeness) to infer properties other 
than sharing between pairs of variables and which cannot be inferred if $SS^\rho$ is 
used instead. We have proposed some possible solutions to this problem. 
Abstract. This paper presents a backward sharing analysis for logic 
programs. The analysis computes pre-conditions for a query that guar- 
antee a given post-condition is satisfied after the query is successfully 
executed. The analysis uses a pair sharing domain and is capable of in- 
ferring pre-conditions that ensure the absence of sharing. This, in turn, 
has many applications in logic programming. The work is unique in that 
it demonstrates that backward analysis is applicable even for properties 
that are not closed under instantiation. 

Keywords: Abstract interpretation; Backward analysis; Pair-Sharing 



1 Introduction 

Sharing analysis is useful in specialising, optimising, compiling and parallelising 
logic programs and thus sharing analysis is an important topic of both abstract 
interpretation and logic programming. Sharing domains track possible sharing 
between program variables since optimisations and transformations can typically 
only be applied in the absence of sharing. Conventionally, sharing is traced in 
the direction of the control-flow in a query-directed fashion from an initial state. 
This paper considers the dual problem: the problem of inferring a set of initial 
states for which an optimisation or transformation is applicable. Specifically, the 
paper presents a novel backward sharing analysis that propagates information 
against the control-flow to infer pre-conditions on the variable sharing of a query. 
The pre-conditions are inferred from a given post-condition which encodes the 
sharing requirement. The analysis guarantees that if the inferred pre-condition 
holds for a query, then any successful computation will satisfy the post-condition, 
thereby ensuring that the optimisation or transformation is applicable. 

This paper presents a novel, backward sharing analysis that is realised with 
abstract interpretation [2]. It is constructed as a suite of abstract operations 
on the classic pair-sharing domain [1,14,24] which captures information about 
linearity and variable independence. These operations instantiate a backward 
analysis framework which, in turn, takes care of the algorithmic concerns associated 
with a fixpoint calculation. This paper focuses on the two key abstract 
operations: the backward abstract unification and the backward abstract com- 
position operations. The backward abstract unification operation computes a 
pre-condition for a given equation and its post-condition. The backward abstract 
composition operation calculates a pre-condition for a call from its post-condition 
and a description of its answer substitutions. The other abstract operations are 
much simpler and are more or less straightforward to construct. These operations 
are omitted from the paper for brevity. 

The remainder of the paper is organised as follows. Section 2 introduces basic 
concepts used throughout the paper. Section 3 contains a brief description of the 
abstract interpretation framework within which the backward sharing analysis 
sits. Sections 4-6 introduce the abstract domain and the abstract operations. 
Section 7 reviews related work and section 8 concludes. Proofs are omitted due 
to space limitation. 

2 Preliminaries 

This section recalls some basic concepts in logic programming and abstract in- 
terpretation. The reader is referred to [17] and [2] for more detailed exposition. 

Let $\Sigma$ be a set of function symbols, $V$ a denumerable set of variables. We 
assume that $\Sigma$ contains at least one function symbol of arity 0. $Term$ denotes 
the set of terms that can be constructed from $\Sigma$ and $V$. 

An equation is a formula of the form $t_1 = t_2$ with $t_1, t_2 \in Term$. The set of 
all equations is denoted as $Eqn$ whereas the set of all substitutions is denoted 
$Sub$. Let $dom(\theta)$ be the domain of a substitution $\theta$, and $V(o)$ the set of variables 
in the syntactic object $o$. Let $Sub_{fail} = Sub \cup \{fail\}$. Given $e \in Eqn$, 
$mgu : Eqn \mapsto Sub_{fail}$ returns either a most general unifier for $e$ if $e$ is unifiable or $fail$ 
otherwise. For brevity, let $mgu(t_1, t_2) = mgu(t_1 = t_2)$. The function composition 
operation $\circ$ is defined as $f \circ g = \lambda x.f(g(x))$. Denote the size of a term $t$ by $|t|$ 
and the number of elements in a set $S$ by $|S|$. 

Let $\langle C, \sqsubseteq_C, \sqcup_C, \sqcap_C, \top_C, \bot_C \rangle$ be a complete lattice and $S \subseteq C$. $S$ is a Moore 
family iff $\top_C \in S$ and $(s_1 \sqcap_C s_2) \in S$ for any $s_1, s_2 \in S$. Let $\langle D, \sqsubseteq_D \rangle$ be a poset. 
A function $\gamma : D \mapsto C$ is a concretization function iff $\gamma$ is monotone and 
$\gamma(D)$ is a Moore family. A concretization function from $D$ to $C$ induces a Galois 
connection between $D$ and $C$ [2]. The induced adjoint, called an abstraction 
function, is $\alpha(c) = \sqcap_D \{d \in D \mid c \sqsubseteq_C \gamma(d)\}$. 

3 Framework for Backward Analysis 

The backward sharing analysis is based on a novel abstract semantics [18]. The 
abstract semantics is sketched below so that the paper is self-contained. It is a 
(lower) approximation to a collecting semantics that maps a call $p(x)$ and a set 
$\Theta$ of substitutions into a set $\Xi$ of substitutions such that, for any $\xi \in \Xi$, if $\delta$ 
is a computed answer for $\xi(p(x))$ then $\delta \circ \xi \in \Theta$, i.e., $\{\Xi\}\,p(x)\,\{\Theta\}$ is a valid 
partial correctness formula. Note that $\Xi = \emptyset$ is a valid solution. In a more precise 
solution, $\Xi$ contains more substitutions without compromising correctness. The 
collecting semantics is defined on the concrete domain $\langle \wp(Sub), \subseteq \rangle$. It is defined 
in terms of a suite of concrete operations. The two most important operators 
are $uf^{-1} : Eqn \times \wp(Sub) \mapsto \wp(Sub)$ and $\circ^{-1} : \wp(Sub) \times \wp(Sub) \mapsto \wp(Sub)$, defined by 

$uf^{-1}(e, \Theta) = \{\xi \in Sub \mid mgu(\xi(e)) \circ \xi \in \Theta\}$ 

$\Psi \circ^{-1} \Theta = \{\omega \in Sub \mid \forall \psi \in \Psi.(\psi \circ \omega \in \Theta)\}$ 
The concrete operation $uf^{-1}$ reverses unification. For a given equation $e$ and a 
set $\Theta$ of success substitutions, it returns the set of those initial substitutions 
$\xi$ such that the unification of $e$ under $\xi$ results in a success substitution in $\Theta$. 
The concrete operation $\circ^{-1}$ reverses the composition of one substitution with 
another. Given a set $\Theta$ of success substitutions and a set $\Psi$ of computed answer 
substitutions, it calculates the set of those initial substitutions $\omega$ such that the 
composition of any $\psi \in \Psi$ with $\omega$ obtains a success substitution in $\Theta$. 

The abstract semantics is parameterised by an abstract domain $\langle Z, \sqsubseteq \rangle$ but 
actually operates on the disjunctive completion of $\langle Z, \sqsubseteq \rangle$. Let $S \subseteq Z$ and define 
$\downarrow(S) = \{z_1 \in Z \mid \exists z_2 \in S. z_1 \sqsubseteq z_2\}$. The set of order-ideals of $Z$, denoted $\wp^{\downarrow}(Z)$, 
is defined by $\wp^{\downarrow}(Z) = \{S \subseteq Z \mid S = \downarrow(S)\}$. Note that each order-ideal can be 
represented by the collection of its maximal elements. This representation of an 
order-ideal will be used in the sequel. The abstract semantics operates over $\wp^{\downarrow}(Z)$ 
to express pre-conditions which are disjunctive [18]. The semantics essentially 
computes a denotation for each call which maps a single post-condition (in $Z$) 
to a disjunction of pre-conditions (in $\wp^{\downarrow}(Z)$). The abstract semantics is defined 
in terms of a suite of abstract operations - one for each concrete operation. 
Implementing these operations instantiates the abstract semantics to obtain a 
backward analysis. The two abstract operations that mimic the concrete $uf^{-1}$ and $\circ^{-1}$ are 
$uf^{-1} : Eqn \times Z \mapsto \wp^{\downarrow}(Z)$ and $\circ^{-1} : Z \times Z \mapsto \wp^{\downarrow}(Z)$. The backward abstract 
unification operation $uf^{-1}$ computes a pre-condition for a given equation and 
its post-condition. The backward abstract composition operation $\circ^{-1}$ calculates 
a pre-condition for an atom from its post-condition and a description of its 
answer substitutions. These abstract operations are obtained by inverting the 
corresponding abstract operations from a forward sharing analysis. Let $\gamma : Z \mapsto \wp(Sub)$ 
be a concretization function. Define $\gamma^{\downarrow}(Y) = \bigcup_{y \in Y} \gamma(y)$. These abstract 
operations are required to satisfy their local safety requirements. 

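(a) $\gamma^{\downarrow}(uf^{-1}(e, z)) \subseteq uf^{-1}(e, \gamma(z))$ for any $e \in Eqn$ and any $z \in Z$, and 

(b) $\gamma^{\downarrow}(z \circ^{-1} z') \subseteq \gamma(z) \circ^{-1} \gamma(z')$ for any $z, z' \in Z$. 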

These requirements state that each abstract operation faithfully lower approxi- 
mates its corresponding concrete operation. 

The following three sections present the backward sharing analysis. The shar- 
ing domain captures information about linearity and dependencies between vari- 
ables of interest. The abstract operations are obtained by inverting abstract 
operations from a forward sharing analysis. 



4 Abstract Domain 

A term t is linear iff it does not contain multiple occurrences of any variable. Let 
the predicate linear(t) hold iff t is linear. Two terms s and t share a variable iff 
$V(s) \cap V(t) \neq \emptyset$. Two variables $x$ and $y$ share under a substitution $\theta$ if $\theta(x)$ and 
$\theta(y)$ share. The possible sharing and possible non-linearity of variables under a 
substitution $\theta$ are represented as a symmetric relation $\pi \subseteq VI \times VI$ [24] where $VI$ 
is the set of variables in the program. Let $PS$ be the set of symmetric relations 
over $VI$. The abstract domain for sharing and linearity, dubbed pair sharing, is 
$\langle PS, \subseteq, \emptyset, VI^2, \cap, \cup \rangle$ which is a complete lattice. A Galois connection between 
$\langle PS, \subseteq \rangle$ and $\langle \wp(Sub), \subseteq \rangle$ is obtained as follows [1]. 



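$\alpha : \wp(Sub) \mapsto PS$ 

$\alpha(\Theta) = \bigcup_{\theta \in \Theta} \{ (x, y) \in VI^2 \mid (x \neq y \wedge V(\theta(x)) \cap V(\theta(y)) \neq \emptyset) \vee (x = y \wedge \neg linear(\theta(x))) \}$ 

$\gamma : PS \mapsto \wp(Sub)$ 

$\gamma(\pi) = \bigcup \{\Theta \subseteq Sub \mid \alpha(\Theta) \subseteq \pi\}$ 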



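We will write $(u \leftrightarrow v) \in \pi$ to stand for $\{(u,v), (v,u)\} \subseteq \pi$. Thus, the set 
$\{x_1 \leftrightarrow y_1, \ldots, x_n \leftrightarrow y_n\}$ abbreviates $\bigcup_{i=1}^{n} \{(x_i, y_i), (y_i, x_i)\}$. If $(u \leftrightarrow v) \in \pi$ then 
$(u \leftrightarrow v)$ is called a link in $\pi$. We will also use $u \overset{\pi}{\leftrightarrow} v$ to abbreviate $(u \leftrightarrow v) \in \pi$ 
and $u \overset{\pi}{\Leftrightarrow} v$ to indicate $(u = v \vee u \overset{\pi}{\leftrightarrow} v)$. Define $X \otimes Y = X \times Y \cup Y \times X$ where 
$X \times Y$ is the Cartesian product of $X$ and $Y$. $X \otimes Y$ is used to generate a link 
between each variable of $X$ and each variable of $Y$. For instance $\{x\} \otimes \{y, z\} = 
\{x \leftrightarrow y, x \leftrightarrow z\}$. Define $\pi_x = \{y \mid (x \leftrightarrow y) \in \pi\}$. The set $\pi_x$ includes all the 
variables that share with $x$ in $\pi$. Note that $x \in \pi_x$ if $(x,x) \in \pi$. As a further 
example, $\pi_x = \{y, z\}$ and $\pi_y = \{x, y\}$ where $\pi = \{x \leftrightarrow y, x \leftrightarrow z, y \leftrightarrow y\}$. Define 
$\pi_X = \bigcup_{x \in X} \pi_x$ where $X \subseteq VI$. 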
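
The notation above maps directly onto finite sets. The following is a minimal sketch, not part of the analysis itself, that assumes a link u ↔ v is stored as an unordered pair (a singleton for u ↔ u) so that symmetry holds by construction. The names `link`, `otimes`, `neighbours` and `neighbours_of_set` are hypothetical helpers standing for u ↔ v, X ⊗ Y, π_x and π_X respectively.

```python
from itertools import product

def link(u, v):
    """The link u <-> v as an unordered pair; u <-> u becomes a singleton."""
    return frozenset((u, v))

def otimes(xs, ys):
    """X (x) Y: one link between each variable of X and each variable of Y."""
    return {link(x, y) for x, y in product(xs, ys)}

def neighbours(pi, x):
    """pi_x: the variables y such that (x <-> y) is a link in pi."""
    out = set()
    for l in pi:
        if x in l:
            # a self-link x <-> x contributes x itself
            out |= (l - {x}) or {x}
    return out

def neighbours_of_set(pi, xs):
    """pi_X: the union of pi_x over all x in X."""
    return set().union(*(neighbours(pi, x) for x in xs))

if __name__ == "__main__":
    pi = {link("x", "y"), link("x", "z"), link("y", "y")}
    print(neighbours(pi, "x"), neighbours(pi, "y"))
```

With π = {x ↔ y, x ↔ z, y ↔ y} this reproduces the two sets π_x and π_y given in the text.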

The next stage in the design of the backward sharing analysis is to construct 
the abstract operations and argue their correctness. 



5 Abstract Operation $uf^{-1}$ 

The backward abstract unification operation $uf^{-1}$ computes a pre-condition for 
a given equation and a given post-condition. It is constructed by inverting a 
forward abstract unification operation given below. The following predicate 
$\chi : Term \times PS \mapsto \{true, false\}$ will be used in the definition of the forward abstract 
unification operation: $\chi(t, \pi) = \neg linear(t) \vee ((V(t))^2 \cap \pi \neq \emptyset)$. The predicate 
$\chi(t, \pi)$ holds if $\theta(t)$ is non-linear for some $\theta \in \gamma(\pi)$ [1]. We abbreviate $\chi(t, \pi) = true$ 
as $\chi(t, \pi)$. The forward abstract unification operation is derived from an 
operation given in [14]. 

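$$uf(s = t, \pi) = \begin{cases} \pi \setminus (V(s) \otimes VI) & \text{if } V(t) = \emptyset \\ \pi \setminus (V(t) \otimes VI) & \text{if } V(s) = \emptyset \\ \pi \cup link(s,t,\pi) \cup (\chi(t,\pi) \triangleright link(s,s,\pi)) \cup (\chi(s,\pi) \triangleright link(t,t,\pi)) & \text{otherwise} \end{cases}$$ 

where $link(s,t,\pi) = \{u \leftrightarrow v \mid x \in V(s) \wedge x \overset{\pi}{\Leftrightarrow} u \wedge v \overset{\pi}{\Leftrightarrow} y \wedge y \in V(t)\}$ and 
$\triangleright$ is defined by $B \triangleright \pi = (\text{if } B \text{ then } \pi \text{ else } \emptyset)$. The forward abstract unification 
operation safely upper approximates the forward concrete unification operation 
$uf$ [14] where 

$uf(e, \Theta) = \{mgu(\theta(e)) \circ \theta \mid \theta \in \Theta\}$ 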
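
The forward abstract unification can be sketched in a few lines. The code below is an illustration under stated assumptions, not the implementation of [14]: a term is represented only by the list of its variable occurrences (so `["y", "z"]` stands for f(y,z) and `[]` for a ground term), which is all that V(t) and linear(t) need, and the names `reach`, `chi`, `links` and `uf` are hypothetical.

```python
from itertools import product

def link(u, v):
    """The link u <-> v as an unordered pair."""
    return frozenset((u, v))

def otimes(xs, ys):
    """X (x) Y."""
    return {link(x, y) for x, y in product(xs, ys)}

def reach(pi, x):
    """The variables u with u = x or (u <-> x) in pi."""
    return {x} | {u for l in pi if x in l for u in l}

def chi(t, pi):
    """chi(t, pi): t is non-linear, or (V(t))^2 meets pi."""
    vs = set(t)
    return len(t) != len(vs) or bool(otimes(vs, vs) & pi)

def links(s, t, pi):
    """link(s, t, pi): connect everything reaching V(s) with everything reaching V(t)."""
    return {link(u, v)
            for x in set(s) for y in set(t)
            for u in reach(pi, x) for v in reach(pi, y)}

def uf(s, t, pi, vi):
    """Forward abstract unification of s = t in pair sharing pi over variables vi."""
    vs, vt = set(s), set(t)
    if not vt:                      # t is ground: drop links incident to V(s)
        return set(pi) - otimes(vs, vi)
    if not vs:                      # s is ground: drop links incident to V(t)
        return set(pi) - otimes(vt, vi)
    out = set(pi) | links(s, t, pi)
    if chi(t, pi):
        out |= links(s, s, pi)
    if chi(s, pi):
        out |= links(t, t, pi)
    return out
```

For instance, `uf(["x"], ["y", "z"], psi, vi)` with `psi = {w ↔ x, x ↔ y, x ↔ z, y ↔ z}` corresponds to the unification x = f(y,z) and produces, among others, the link w ↔ y.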
The following lemma justifies the construction of the backward abstract 
unification operation by inverting the forward abstract unification operation. 

Lemma 1. If $uf(s = t, \pi) \subseteq \psi$ then $\gamma(\pi) \subseteq uf^{-1}(s = t, \gamma(\psi))$. □

According to lemma 1, a pre-condition for an equation and a post-condition 
can be obtained as follows. The forward abstract unification operator is run on 
the equation $s = t$ and each $\pi \in PS$. The pre-condition contains those $\pi$ such 
that $uf(s = t, \pi) \subseteq \psi$ which are also maximal. Therefore, the following is a 
correct specification for the backward abstract unification operation. 

$uf^{-1}(s = t, \psi) = \{\pi \mid uf(s = t, \pi) \subseteq \psi \wedge \forall \pi' \in PS.(uf(s = t, \pi') \subseteq \psi \rightarrow \pi \not\subset \pi')\}$ 

Computing $uf^{-1}(s = t, \psi)$ via membership checking is however not feasible. 
Suppose that $VI$ contains $n$ variables. The abstract domain then has $2^{n(n+1)/2}$ 
elements. Running $uf$ on all these elements is practically impossible even for a 
relatively small $n$, say 7. The remainder of this subsection gives a polynomial 
method for computing $uf^{-1}(s = t, \psi)$ starting with simple cases. Without loss 
of generality, we assume that $s$ and $t$ unify for otherwise, $\{VI^2\}$ is a valid 
pre-condition. 
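
To make the size concrete, note that a symmetric relation over $n$ variables is determined by a subset of the $n(n+1)/2$ unordered pairs of variables (pairs of the form $x \leftrightarrow x$ included). The arithmetic for the case $n = 7$ mentioned above is:

```latex
\[
n = 7:\qquad \frac{n(n+1)}{2} = \frac{7 \cdot 8}{2} = 28,
\qquad |PS| = 2^{28} = 268\,435\,456 .
\]
```

So even seven variables would require over a quarter of a billion runs of the forward operation for a single equation.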

Case $V(t) = \emptyset$. The effect of the forward abstract unification of $s = t$ in a pair 
sharing $\pi$ is to remove those links that are incident to variables in $V(s)$. Let $\psi$ be 
the result of this pruning process - the post-condition. Then the unique maximal 
pre-condition is given by $\psi \cup (V(s) \otimes VI)$. So, $uf^{-1}(s = t, \psi) = \{\psi \cup (V(s) \otimes VI)\}$. 

Example 1. Let $\psi = \{w \leftrightarrow x, x \leftrightarrow x\}$ and $VI = \{w, x, y, z\}$. Then $uf^{-1}(f(x, y) = 
f(a, b), \psi) = \{\{w \leftrightarrow x, w \leftrightarrow y, x \leftrightarrow x, x \leftrightarrow y, x \leftrightarrow z, y \leftrightarrow y, y \leftrightarrow z\}\}$. □

Case $V(s) = \emptyset$. By symmetry to the above, $uf^{-1}(s = t, \psi) = \{\psi \cup (V(t) \otimes VI)\}$. 
When $V(t) = \emptyset$ and $V(s) = \emptyset$, both cases apply and $uf^{-1}(s = t, \psi) = \{\psi\}$. 

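Case $V(s) \neq \emptyset \wedge V(t) \neq \emptyset$. By the definition of $uf$, we have $\pi \subseteq uf(s = t, \pi)$ for 
any $\pi$. Thus if $uf(s = t, \pi) \subseteq \psi$ then $\pi \subseteq \psi$, hence $\pi$ can be obtained by removing 
a symmetric subset $\tau$ of $\psi$. The problem is how to find $\tau$. The forward abstract 
unification operation $uf$ produces a link $u \leftrightarrow v$ from $s = t$ and a subset of the 
links in $\psi$. Such a subset of links justifies the presence of $u \leftrightarrow v$ in $uf(s = t, \psi)$; 
and is henceforth called a support set for $u \leftrightarrow v$. 

Example 2. Let $\psi = \{w \leftrightarrow x, x \leftrightarrow y, x \leftrightarrow z, y \leftrightarrow z\}$. We have $\chi(x, \psi) = 
false$ and $\chi(f(y,z), \psi) = true$. So, $uf(x = f(y,z), \psi) = \psi \cup link(x, f(y,z), \psi) \cup 
(true \triangleright link(x, x, \psi)) \cup (false \triangleright link(f(y,z), f(y,z), \psi)) = \psi \cup link(x, f(y,z), \psi) \cup 
link(x, x, \psi)$. That $(w \leftrightarrow y) \in link(x, f(y,z), \psi)$ has two justifications: one is 
that $(w \leftrightarrow x) \in \psi$, $x \in V(x)$ and $y \in V(f(y,z))$; the other is that $(w \leftrightarrow x) \in \psi$, 
$x \in V(x)$, $z \in V(f(y,z))$ and $(y \leftrightarrow z) \in \psi$. The link $(w \leftrightarrow y)$ occurs in 
$link(x, x, \psi)$ because $(w \leftrightarrow x) \in \psi$, $x \in V(x)$, $x \in V(x)$ and $(x \leftrightarrow y) \in \psi$. Thus, 
there are three support sets for $w \leftrightarrow y$: $S_1 = \{w \leftrightarrow x\}$, $S_2 = \{w \leftrightarrow x, y \leftrightarrow z\}$ 
and $S_3 = \{w \leftrightarrow x, x \leftrightarrow y\}$. □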
In order to ensure that forward abstract unification cannot produce a link 
$u \leftrightarrow v$ that is not in $\psi$, all the support sets for $u \leftrightarrow v$ must be destroyed. A 
support set is destroyed if just one of its links is removed. Therefore, to prevent 
$u \leftrightarrow v$ from being produced, it is necessary to remove a set of links that contains 
one link from each of its support sets. Such a set is called a frontier for $u \leftrightarrow v$. 

Definition 1. Let $h : PS \mapsto PS$ be monotonic, $\psi \in PS$ and $\tau \subseteq \psi$. If 
$(u \leftrightarrow v) \in h(\psi)$ and $(u \leftrightarrow v) \notin h(\psi \setminus \tau)$, we say that (removal of) $\tau$ (from 
$\psi$) excludes $u \leftrightarrow v$ from $h(\psi)$ and that $\tau$ is a frontier for $u \leftrightarrow v$ in $h(\psi)$. Let 
$\phi \in PS$. We say that $\tau$ is a frontier for $\phi$ in $h(\psi)$ if, for each link $u \leftrightarrow v \in \phi$, $\tau$ 
is a frontier for $u \leftrightarrow v$ in $h(\psi)$. □

The particular notions of frontier and exclusion that are required to define backward 
abstract unification are obtained by putting $h = \lambda\phi.uf(s = t, \phi)$. Another 
instance of these concepts appears in section 6. 

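Example 3. There are four frontiers for the $w \leftrightarrow y$ link of example 2. They 
are $F_1 = \{w \leftrightarrow x\}$, $F_2 = \{w \leftrightarrow x, x \leftrightarrow y\}$, $F_3 = \{w \leftrightarrow x, y \leftrightarrow z\}$ and 
$F_4 = \{w \leftrightarrow x, x \leftrightarrow y, y \leftrightarrow z\}$. Removing any $F_i$ for $1 \leq i \leq 4$ from $\psi$ will 
prevent $w \leftrightarrow y$ from being produced. □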

The above example demonstrates that one frontier for a link may be contained 
in another. Removing one frontier from $\psi$ results in a pair sharing that is 
a superset of that obtained by removing another frontier from $\psi$. Since the 
pre-condition that is the object of the computation contains maximal pair sharings, 
only minimal frontiers for the link should be removed. The following example 
shows that a link may have more than one minimal frontier. 

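Example 4. Let $\psi = \{w \leftrightarrow x, y \leftrightarrow z\}$. Then $(w \leftrightarrow z) \in uf(x = g(y), \psi)$. The 
link has one support set: $\{w \leftrightarrow x, y \leftrightarrow z\}$. Two minimal frontiers for $w \leftrightarrow z$ are 
$\{w \leftrightarrow x\}$ and $\{y \leftrightarrow z\}$ which are incomparable. □

Some links have no frontiers at all. For example, let $\psi = \emptyset$. Then $uf(x = y, \psi) = 
\{x \leftrightarrow y\}$. This indicates that the post-condition $\psi$ is unsatisfiable. 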

Definition 2. Let $h : PS \mapsto PS$ be monotonic, $\psi \in PS$ and $\Pi \subseteq PS$. $\Pi$ is a 
complete set of frontiers for a link $u \leftrightarrow v$ (a set $\phi$ of links respectively) in $h(\psi)$ 
if 

(i) every $\tau \in \Pi$ is a frontier for $u \leftrightarrow v$ ($\phi$ respectively) in $h(\psi)$; and 
(ii) every minimal frontier for $u \leftrightarrow v$ ($\phi$ respectively) in $h(\psi)$ is in $\Pi$. □

Observe that a complete set of frontiers may contain a non-minimal frontier. 

5.1 Minimal Frontier Function 

By the definition of $uf$, a link $(u \leftrightarrow v) \notin \psi$ occurs in $uf(s = t, \psi)$ iff it occurs in 
$link(s, t, \psi)$, or $\chi(t, \psi) \triangleright link(s, s, \psi)$ or $\chi(s, \psi) \triangleright link(t, t, \psi)$. Thus, it is excluded 
from $uf(s = t, \psi)$ iff it is excluded from $link(s, t, \psi)$, and $\chi(t, \psi) \triangleright link(s, s, \psi)$ 
and $\chi(s, \psi) \triangleright link(t, t, \psi)$. 

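We first consider how to exclude a link $u \leftrightarrow v$ from $link(s, t, \psi)$. We rewrite 
the definition of $link(s, t, \psi)$ into $link(s, t, \psi) = \bigcup_{i=1}^{4} \sigma_i(s, t, \psi)$ where 

$\sigma_1(s, t, \psi) = \{u \leftrightarrow v \mid u \in V(s) \wedge v \in V(t)\}$ 

$\sigma_2(s, t, \psi) = \{u \leftrightarrow v \mid u \in V(s) \wedge v \notin V(t) \wedge (\psi_v \cap V(t)) \neq \emptyset\}$ 

$\sigma_3(s, t, \psi) = \{u \leftrightarrow v \mid u \notin V(s) \wedge v \in V(t) \wedge (\psi_u \cap V(s)) \neq \emptyset\}$ 

$\sigma_4(s, t, \psi) = \{u \leftrightarrow v \mid u \notin V(s) \wedge v \notin V(t) \wedge (\psi_u \cap V(s)) \neq \emptyset \wedge (\psi_v \cap V(t)) \neq \emptyset\}$ 

Observe that $(u \leftrightarrow v) \in link(s, t, \psi)$ iff $(u \leftrightarrow v) \in \sigma_i(s, t, \psi)$ for some $1 \leq i \leq 4$. 
Note that $(u \leftrightarrow v) \in \sigma_i(s, t, \psi)$ implies $(u \leftrightarrow v) \notin \sigma_j(s, t, \psi)$ for $j \neq i$. The 
following computes the set of minimal frontiers for $u \leftrightarrow v$ in $link(s, t, \psi)$. 

$$mf_{link}(u, v, s, t, \psi) = \begin{cases} \emptyset & \text{if } u \in V(s) \wedge v \in V(t) \\ \{(\{v\} \otimes V(t)) \cap \psi\} & \text{if } u \in V(s) \wedge v \notin V(t) \\ \{(\{u\} \otimes V(s)) \cap \psi\} & \text{if } u \notin V(s) \wedge v \in V(t) \\ \{(\{u\} \otimes V(s)) \cap \psi, (\{v\} \otimes V(t)) \cap \psi\} & \text{if } u \notin V(s) \wedge v \notin V(t) \end{cases}$$ 

Each element in $mf_{link}(u, v, s, t, \psi)$ excludes $u \leftrightarrow v$ from $link(s, t, \psi)$. The empty 
set in the first branch indicates that the presence of $u \leftrightarrow v$ in $\sigma_1(s, t, \psi)$ is 
independent of $\psi$ and hence cannot be excluded. The second contains one frontier 
that consists of links between $v$ and variables in $V(t)$. The third is dual to the 
second. The fourth returns a set of two minimal frontiers. One consists of links 
between $v$ and variables in $V(t)$; and the other consists of links between $u$ and 
variables in $V(s)$. 

Lemma 2. $mf_{link}(u, v, s, t, \psi)$ is a complete set of frontiers for $u \leftrightarrow v$ in 
$link(s, t, \psi)$. □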

We now consider how to exclude $u \leftrightarrow v$ from $(\chi(t, \psi) \triangleright link(s, s, \psi))$. By 
the definition of $\triangleright$, $\chi(t, \psi) \triangleright link(s, s, \psi) = (\text{if } \chi(t, \psi) \text{ then } link(s, s, \psi) \text{ else } \emptyset)$. 
Thus, we can either make the condition $\chi(t, \psi)$ false or exclude $u \leftrightarrow v$ from 
$link(s, s, \psi)$. The latter can be accomplished by removing from $\psi$ any element 
in $mf_{link}(u, v, s, s, \psi)$. Note that $\chi(t, \psi) = \neg linear(t) \vee ((V(t))^2 \cap \psi \neq \emptyset)$. If 
$\neg linear(t)$ holds then $\chi(t, \psi)$ cannot be falsified by removing any part of $\psi$. In 
this case, we can exclude $u \leftrightarrow v$ from $\chi(t, \psi) \triangleright link(s, s, \psi)$ only by removing an 
element in $mf_{link}(u, v, s, s, \psi)$. Otherwise, $linear(t)$ holds. We can alternatively 
choose to falsify $((V(t))^2 \cap \psi \neq \emptyset)$. This can be done by removing all the links 
in $(V(t))^2 \cap \psi$. Thus, each element in $((linear(t) \wedge ((V(t))^2 \cap \psi \neq \emptyset)) \triangleright \{(V(t))^2 \cap 
\psi\}) \cup mf_{link}(u, v, s, s, \psi)$ excludes $u \leftrightarrow v$ from $\chi(t, \psi) \triangleright link(s, s, \psi)$. Excluding 
$u \leftrightarrow v$ from $(\chi(s, \psi) \triangleright link(t, t, \psi))$ is symmetric. 

Lemma 3. $((linear(t) \wedge ((V(t))^2 \cap \psi \neq \emptyset)) \triangleright \{(V(t))^2 \cap \psi\}) \cup mf_{link}(u, v, s, s, \psi)$ 
is a complete set of frontiers for $u \leftrightarrow v$ in $\chi(t, \psi) \triangleright link(s, s, \psi)$. □
In order to exclude a link $u \leftrightarrow v$ from $h_1(\psi) \cup h_2(\psi)$, it is necessary to exclude 
$u \leftrightarrow v$ from both $h_1(\psi)$ and $h_2(\psi)$. This can be accomplished by removing from 
$\psi$ a frontier for $u \leftrightarrow v$ in $h_1(\psi)$ and a frontier for $u \leftrightarrow v$ in $h_2(\psi)$. The union of 
a frontier for $u \leftrightarrow v$ in $h_1(\psi)$ and a frontier for $u \leftrightarrow v$ in $h_2(\psi)$ is a frontier for 
$u \leftrightarrow v$ in $h_1(\psi) \cup h_2(\psi)$. To this end, define $\Phi_1 \uplus \Phi_2 = min(\{\phi_1 \cup \phi_2 \mid \phi_1 \in \Phi_1 \wedge \phi_2 \in \Phi_2\})$ 
where $min(\Pi)$ returns the set of the elements in $\Pi$ that are minimal with respect 
to $\subseteq$. The operation $\uplus$ is commutative and associative. 

Lemma 4. Let $\psi \in PS$, $h_1, h_2 : PS \mapsto PS$ monotonic functions, and $\Phi_i$ a 
complete set of frontiers for a link $u \leftrightarrow v$ in $h_i(\psi)$ for $1 \leq i \leq 2$. Then $\Phi_1 \uplus \Phi_2$ 
is a complete set of frontiers for $u \leftrightarrow v$ in $h_1(\psi) \cup h_2(\psi)$. □

The function $mf(u, v, s, t, \psi)$ below returns the set of all minimal frontiers 
for $u \leftrightarrow v$ in $uf(s = t, \psi)$. 

$mf(u, v, s, t, \psi) = mf_{link}(u, v, s, t, \psi)$ 
$\quad \uplus\ (((linear(t) \wedge ((V(t))^2 \cap \psi \neq \emptyset)) \triangleright \{(V(t))^2 \cap \psi\}) \cup mf_{link}(u, v, s, s, \psi))$ 
$\quad \uplus\ (((linear(s) \wedge ((V(s))^2 \cap \psi \neq \emptyset)) \triangleright \{(V(s))^2 \cap \psi\}) \cup mf_{link}(u, v, t, t, \psi))$ 

Note that non-minimal frontiers for a link are removed by the $min$ operation 
employed in the $\uplus$ operation and that minimal frontiers for a link are computed 
without computing support sets for the link. 

Lemma 5. $mf(u, v, s, t, \psi)$ is a complete set of frontiers for $u \leftrightarrow v$ in 
$uf(s = t, \psi)$. □



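Example 5. Let $\psi = \{w \leftrightarrow x, y \leftrightarrow z\}$. Then $(w \leftrightarrow z) \in uf(x = g(y), \psi)$. 
The set $mf(w, z, x, g(y), \psi)$ of minimal frontiers for $w \leftrightarrow z$ in $uf(x = g(y), \psi)$ 
is computed as follows. We calculate $linear(g(y)) = true$ and $(V(g(y)))^2 \cap \psi = 
\{y \leftrightarrow y\} \cap \psi = \emptyset$. Thus, $(linear(g(y)) \wedge ((V(g(y)))^2 \cap \psi \neq \emptyset)) = false$ and hence 
$(linear(g(y)) \wedge ((V(g(y)))^2 \cap \psi \neq \emptyset)) \triangleright \{(V(g(y)))^2 \cap \psi\} = \emptyset$. We can also obtain 
$(linear(x) \wedge ((V(x))^2 \cap \psi \neq \emptyset)) \triangleright \{(V(x))^2 \cap \psi\} = \emptyset$. Thus, $mf(w, z, x, g(y), \psi) = 
mf_{link}(w, z, x, g(y), \psi) \uplus (\emptyset \cup mf_{link}(w, z, x, x, \psi)) \uplus (\emptyset \cup mf_{link}(w, z, g(y), g(y), \psi))$. 
We first calculate $mf_{link}(w, z, x, g(y), \psi) = \{(\{w\} \otimes \{x\}) \cap \psi, (\{z\} \otimes \{y\}) \cap 
\psi\} = \{\{w \leftrightarrow x\}, \{y \leftrightarrow z\}\}$ since $w \notin \{x\}$ and $z \notin \{y\}$. We can also obtain 
$mf_{link}(w, z, x, x, \psi) = \{\{w \leftrightarrow x\}\}$ and $mf_{link}(w, z, g(y), g(y), \psi) = \{\{y \leftrightarrow z\}\}$. 
So, $mf(w, z, x, g(y), \psi) = \{\{w \leftrightarrow x\}, \{y \leftrightarrow z\}\} \uplus \{\{w \leftrightarrow x\}\} \uplus \{\{y \leftrightarrow z\}\} = 
\{\{w \leftrightarrow x\}, \{y \leftrightarrow z\}\}$. □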

Suppose that two links $u \leftrightarrow v$ and $u' \leftrightarrow v'$ need to be excluded from $uf(s = t, \psi)$. 
Removing from $\psi$ a frontier for one link will exclude the link. Both links will 
be excluded if the union of a frontier for $u \leftrightarrow v$ and a frontier for $u' \leftrightarrow v'$ is 
removed from $\psi$. A frontier for a set of $n$ links is the union of $n$ frontiers - one 
for each link. 
Lemma 6. Let $\psi \in PS$, $h : PS \mapsto PS$ a monotonic function, $\phi_i \in PS$ and 
$\Pi_i \subseteq PS$ for $1 \leq i \leq 2$. If $\Pi_i$ is a complete set of frontiers for $\phi_i$ in $h(\psi)$ then 
$\Pi_1 \uplus \Pi_2$ is a complete set of frontiers for $\phi_1 \cup \phi_2$ in $h(\psi)$. □

The following lemma provides a constructive method for computing 
$uf^{-1}(s = t, \psi)$ for the case $V(s) \neq \emptyset \wedge V(t) \neq \emptyset$. The forward abstract 
unification operation $uf$ is first employed to compute $\psi' = uf(s = t, \psi)$. The set of 
minimal frontiers for $\psi' \setminus \psi$ is then computed. It is 
$\biguplus_{(u \leftrightarrow v) \in (\psi' \setminus \psi)} mf(u, v, s, t, \psi)$. 
Each pair sharing in $uf^{-1}(s = t, \psi)$ is obtained by removing one of these minimal 
frontiers from $\psi$. 

Lemma 7. $uf^{-1}(s = t, \psi) = \{\psi \setminus \tau \mid \tau \in \biguplus_{(u \leftrightarrow v) \in (\psi' \setminus \psi)} mf(u, v, s, t, \psi)\}$ where 
$\psi' = uf(s = t, \psi)$. □

Lemmas 1 and 7 imply the correctness of $uf^{-1}$. We now show that 
$uf^{-1}(s = t, \psi)$ is polynomial in $|s| + |t| + |VI|$. Operations $\cup$, $\cap$ and $\otimes$ are 
polynomial in $|VI|$. Let $\Pi$ be a set of minimal frontiers and $\pi_1, \pi_2 \in \Pi$ such 
that $\pi_2 \neq \pi_1$. Then $\pi_1$ contains at least one link that does not belong to $\pi_2$. 
Thus, $|\Pi| \leq |VI|^2$ because $|\pi_1| \leq |VI|^2$. So, $\uplus$ is polynomial in $|VI|$. The forward 
abstract unification $\psi' = uf(s = t, \psi)$ is polynomial in $|s| + |t| + |VI|$ [1]. All links 
$u \leftrightarrow v$ in $\psi' \setminus \psi$ invoke $mf(u, v, s, t, \psi)$ with the same $s$, $t$ and $\psi$. Thus, $V(s)$, $V(t)$, 
$linear(s)$ and $linear(t)$ can be computed with their results being memoised for 
use in computing $mf(u, v, s, t, \psi)$ for different links $u \leftrightarrow v$. This takes $O(|s| + |t|)$ 
time. Using the memoised results, $mf_{link}(u, v, s, t, \psi)$ is polynomial in $|VI|$; so 
is $mf(u, v, s, t, \psi)$. The computation $\biguplus_{(u \leftrightarrow v) \in (\psi' \setminus \psi)} mf(u, v, s, t, \psi)$ is polynomial 
in $|VI|$ since it invokes $mf$ and $\uplus$ for $|\psi' \setminus \psi| \leq |VI|^2$ times and both $mf$ and $\uplus$ 
are polynomial in $|VI|$. So, $uf^{-1}(s = t, \psi)$ is polynomial in $|s| + |t| + |VI|$. 

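6 Abstract Operation $\circ^{-1}$ 

The operator $\circ : PS \times PS \mapsto PS$ was originally proposed in [1] for composing an 
abstract initial substitution for an atom with an abstract answer substitution 
(for the atom and the abstract initial substitution) to obtain an abstract success 
substitution. It will be inverted to obtain $\circ^{-1}$ and it is defined 

$\pi \circ \phi = \{(u, v) \mid u \overset{\phi}{\leftrightarrow} v \vee \exists x, y.(u \overset{\phi}{\Leftrightarrow} x \wedge x \overset{\pi}{\leftrightarrow} y \wedge y \overset{\phi}{\Leftrightarrow} v)\}$ 

Note that $\circ$ is not commutative. The following result is Lemma 4.4 in [1]. 

Proposition 1 (Codish, Dams and Yardeni). Let $\sigma \in \gamma(\pi)$ and $\theta \in \gamma(\phi)$. 
Then $\sigma \circ \theta \in \gamma(\pi \circ \phi)$. 

The following lemma justifies the construction tactic of inverting $\circ$ to obtain 
$\circ^{-1} : PS \times PS \mapsto \wp^{\downarrow}(PS)$. 

Lemma 8. If $\pi \circ \phi \subseteq \psi$ then $\gamma(\phi) \subseteq (\gamma(\pi) \circ^{-1} \gamma(\psi))$. 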
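The above lemma implies that the following is a correct specification for $\circ^{-1}$. 

$\pi \circ^{-1} \psi = \{\phi \mid (\pi \circ \phi \subseteq \psi) \wedge \forall \phi' \in PS.((\pi \circ \phi' \subseteq \psi) \rightarrow (\phi \not\subset \phi'))\}$ 

Again, we need to find a practical method for computing $\pi \circ^{-1} \psi$. For any $\pi, \phi \in 
PS$, we have $\phi \subseteq \pi \circ \phi$. Thus, if $\pi \circ \phi \subseteq \psi$ then $\phi \subseteq \psi$. A pair sharing in $\pi \circ^{-1} \psi$ 
can be obtained by removing a set $\tau$ of links from $\psi$. The problem is thus again 
equivalent to finding $\tau$. The notion of a support set and that of a frontier carry 
over with $\lambda\phi.(\pi \circ \phi)$ taking the place of $\lambda\phi.uf(s = t, \phi)$. 

Example 6. Let $\psi = \{w \leftrightarrow x, x \leftrightarrow y, y \leftrightarrow z\}$ and $\pi = \{x \leftrightarrow y\}$. Then $\pi \circ \psi = 
\pi \cup \psi \cup \{w \leftrightarrow y, w \leftrightarrow z, x \leftrightarrow x, x \leftrightarrow z, y \leftrightarrow y\}$. There is one support set for 
$w \leftrightarrow z$: $\{w \leftrightarrow x, y \leftrightarrow z\}$. Two minimal frontiers are obtained from the support 
set: $\{w \leftrightarrow x\}$ and $\{y \leftrightarrow z\}$. Removing either of them excludes $w \leftrightarrow z$ from $\pi \circ \psi$. □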


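By expanding the definition of $\circ$, we have $\pi \circ \psi = \pi \cup \psi \cup (\psi \bowtie \pi) \cup (\pi \bowtie 
\psi) \cup (\psi \bowtie \pi \bowtie \psi)$ where $\pi_1 \bowtie \pi_2 = \{u \leftrightarrow v \mid \exists w.(u \overset{\pi_1}{\leftrightarrow} w \wedge w \overset{\pi_2}{\leftrightarrow} v)\}$ is associative. 
The following function computes the set of minimal frontiers for $u \leftrightarrow v$ in $\pi \circ \psi$. 

$mf_{tc}(u, v, \pi, \psi) = \{(\{u\} \otimes \pi_v) \cap \psi\} \uplus \{(\pi_u \otimes \{v\}) \cap \psi\}$ 
$\quad \uplus\ \{(\{u\} \otimes \pi_{\psi_v}) \cap \psi, (\{v\} \otimes \pi_{\psi_u}) \cap \psi\}$ 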

Some explanation is in order. Assume that $(u \leftrightarrow v) \notin \psi$ and $(u \leftrightarrow v) \notin \pi$. 
Consider how to exclude $u \leftrightarrow v$ from $\pi \circ \psi$. The link $u \leftrightarrow v$ belongs to $\psi \bowtie \pi$ if $u$ 
is linked via $\psi$ to any variable that is linked to $v$ via $\pi$. Thus, it is necessary to 
remove all links in $(\{u\} \otimes \pi_v) \cap \psi$. Excluding $u \leftrightarrow v$ from $\pi \bowtie \psi$ is symmetric. 
Observe that $\pi_{\psi_v}$ consists of those variables that are linked to $v$ via a link in $\pi$ 
followed by a link in $\psi$. In order to exclude $u \leftrightarrow v$ from $\psi \bowtie \pi \bowtie \psi$, we either 
remove the set of all links from $u$ to variables in $\pi_{\psi_v}$ or remove the set of all 
links from $v$ to variables in $\pi_{\psi_u}$. The former is $(\{u\} \otimes \pi_{\psi_v}) \cap \psi$ and the latter is 
$(\{v\} \otimes \pi_{\psi_u}) \cap \psi$. Finally, the link $u \leftrightarrow v$ is excluded from $\pi \circ \psi$ if it is excluded 
from $\psi \bowtie \pi$, $\pi \bowtie \psi$ and $\psi \bowtie \pi \bowtie \psi$. 

Lemma 9. For any $\pi \in PS$, $mf_{tc}(u, v, \pi, \psi)$ is a complete set of frontiers for 
$u \leftrightarrow v$ in $(\psi \bowtie \pi) \cup (\pi \bowtie \psi) \cup (\psi \bowtie \pi \bowtie \psi)$. □

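The following lemma provides a polynomial method for computing $\pi \circ^{-1} \psi$. 
Together with lemma 8, it ensures the correctness of the abstract operation $\circ^{-1}$. 
If $\pi \not\subseteq \psi$ then $\pi \circ \phi \not\subseteq \psi$ for any $\phi \in PS$. In this case, the post-condition $\psi$ 
is unsatisfiable and hence the pre-condition is the empty set of pair sharings. 
Otherwise, the pre-condition consists of those pair sharings that are obtained by 
removing minimal frontiers for $(\pi \circ \psi) \setminus \psi$. 

Lemma 10. For any $\pi, \psi \in PS$, 

$$\pi \circ^{-1} \psi = \begin{cases} \emptyset & \text{if } \pi \not\subseteq \psi \\ \{\psi \setminus \tau \mid \tau \in \biguplus_{(u \leftrightarrow v) \in (\pi \circ \psi) \setminus \psi} mf_{tc}(u, v, \pi, \psi)\} & \text{otherwise} \end{cases}$$ □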
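Example 7. Continuing with example 6, $(\pi \circ \psi) \setminus \psi = \{w \leftrightarrow y, w \leftrightarrow z, x \leftrightarrow 
x, x \leftrightarrow z, y \leftrightarrow y\}$. We have $\pi_w = \pi_z = \emptyset$, $\pi_{\psi_z} = \{x\}$ and $\pi_{\psi_w} = \{y\}$. 
Thus, $mf_{tc}(w, z, \pi, \psi) = \{\emptyset\} \uplus \{\emptyset\} \uplus \{(\{w\} \otimes \pi_{\psi_z}) \cap \psi, (\{z\} \otimes \pi_{\psi_w}) \cap 
\psi\} = \{\{w \leftrightarrow x\}, \{y \leftrightarrow z\}\}$. Omitting details, we obtain other sets of minimal 
frontiers: $mf_{tc}(w, y, \pi, \psi) = \{\{w \leftrightarrow x\}\}$, $mf_{tc}(x, z, \pi, \psi) = \{\{y \leftrightarrow z\}\}$ and 
$mf_{tc}(x, x, \pi, \psi) = mf_{tc}(y, y, \pi, \psi) = \{\{x \leftrightarrow y\}\}$. The set of minimal frontiers for 
$(\pi \circ \psi) \setminus \psi$ is $\biguplus_{(u \leftrightarrow v) \in (\pi \circ \psi) \setminus \psi} mf_{tc}(u, v, \pi, \psi) = mf_{tc}(w, y, \pi, \psi) \uplus mf_{tc}(w, z, \pi, \psi) \uplus 
mf_{tc}(x, x, \pi, \psi) \uplus mf_{tc}(x, z, \pi, \psi) \uplus mf_{tc}(y, y, \pi, \psi) = \{\{w \leftrightarrow x, x \leftrightarrow y, y \leftrightarrow z\}\} = 
\{\psi\}$. Thus, $\pi \circ^{-1} \psi = \{\emptyset\}$. □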
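
The calculation in the example above can be replayed with a short program. This is a sketch under stated assumptions rather than the authors' implementation: the subscripts in the third component of $mf_{tc}$ are read here as "variables linked by π to some ψ-neighbour of v (respectively u)", which is an interpretation of the printed formula, and the names `nbrs`, `uplus`, `mf_tc` and `compose_inv` are hypothetical.

```python
from itertools import product

def link(u, v):
    """The link u <-> v as an unordered pair."""
    return frozenset((u, v))

def otimes(xs, ys):
    """X (x) Y."""
    return {link(x, y) for x, y in product(xs, ys)}

def nbrs(pi, xs):
    """pi_X: the variables linked in pi to some variable of X."""
    return {v for l in pi for x in xs if x in l for v in ((l - {x}) or {x})}

def compose(pi, psi):
    """pi o psi, computed from its definition."""
    out = set(psi)
    for l in pi:
        x, y = (tuple(l) * 2)[:2]          # also handles a self-link
        for u in {x} | nbrs(psi, {x}):
            for v in {y} | nbrs(psi, {y}):
                out.add(link(u, v))
    return out

def uplus(f1, f2):
    """F1 (+) F2: pairwise unions, keeping only the subset-minimal ones."""
    cand = {a | b for a in f1 for b in f2}
    return {c for c in cand if not any(d != c and d.issubset(c) for d in cand)}

def mf_tc(u, v, pi, psi):
    """Candidate minimal frontiers for u <-> v in pi o psi.
    ASSUMPTION: the third component uses pi applied to the psi-neighbours."""
    cut = lambda xs, ys: frozenset(otimes(xs, ys) & psi)
    first = {cut({u}, nbrs(pi, {v}))}
    second = {cut(nbrs(pi, {u}), {v})}
    third = {cut({u}, nbrs(pi, nbrs(psi, {v}))),
             cut({v}, nbrs(pi, nbrs(psi, {u})))}
    return uplus(uplus(first, second), third)

def compose_inv(pi, psi):
    """Backward composition following the two cases of Lemma 10."""
    pi, psi = set(pi), set(psi)
    if not pi.issubset(psi):
        return set()                       # post-condition unsatisfiable
    fronts = {frozenset()}
    for l in compose(pi, psi) - psi:
        u, v = (tuple(l) * 2)[:2]
        fronts = uplus(fronts, mf_tc(u, v, pi, psi))
    return {frozenset(psi - t) for t in fronts}
```

Calling `compose_inv` with the π and ψ of example 6 yields a single pre-condition, the empty pair sharing, in agreement with the example.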

We now turn to the time complexity of $\pi \circ^{-1} \psi$. Observe that $\pi \circ \psi$ is polynomial 
in $|VI|$ since $\bowtie$ and $\cup$ are polynomial in $|VI|$. Since $\uplus$, $\otimes$ and $\cap$ are polynomial in 
$|VI|$, $mf_{tc}(u, v, \pi, \psi)$ is polynomial in $|VI|$. Thus, $\biguplus_{(u \leftrightarrow v) \in (\pi \circ \psi) \setminus \psi} mf_{tc}(u, v, \pi, \psi)$ 
is polynomial in $|VI|$; so is $\pi \circ^{-1} \psi$. Both $uf^{-1}(s = t, \psi)$ and $\pi \circ^{-1} \psi$ are polynomial; 
in contrast the widely-used set-sharing analysis has a forward abstract unification 
operator that is exponential [13]. 



7 Related Work 

Though backward analysis has been a subject of intense research in functional 
programming [25,11,5], backward analysis has until very recently [9,15,19,22] 
been rarely applied in logic programming. One notable exception is the demand 
analysis of [4]. This analysis infers the degree of instantiation necessary to
allow the guards of a concurrent constraint program to reduce: it is a local analysis
that does not consider the possible suspension of body calls. The information it
infers is useful in detecting (uni-modal) predicates which can be implemented 
with relatively straightforward suspension machinery. A more elaborate back- 
ward analysis for concurrent constraint programs is [6]. This demand analysis 
infers how much input is necessary for a procedure to generate a certain amount 
of output. This information is useful for adding synchronisation constraints to 
a procedure to delay execution and thereby increase grain size, and yet not 
introduce deadlock. 

Mazur, Janssens and Bruynooghe [21] present a kind of ad hoc backward 
analysis to derive reuse conditions from a goal-independent reuse analysis. The 
analysis propagates reuse information from a point where a structure is decom- 
posed in a clause to the point where the clause is invoked in its parent clause. 
This is similar in spirit to how demand is passed from a callee to a caller in 
the backward analysis described in this paper. However, the reuse analysis does 
not propagate information right-to-left across a clause, resulting in a less precise 
analysis. The above backward analyses are not specialisations of any framework 
for logic program analysis. In [22], a backward analysis is proposed to infer spe- 
cialisation conditions that decide whether a call to a predicate from a particular 
call site should invoke a specialised version of the predicate or an unoptimised 
version. Specifically, if these conditions are satisfied by an (abstract) call in a
goal-dependent analysis then the call will possibly lead to valuable optimisations,
and therefore it should not be merged with calls that lead to a lower level of
optimisation. The specialisation conditions produced by the backward analysis are
not sufficient conditions and need to be checked by an ensuing forward analysis.
In contrast, the pre-conditions obtained by the backward sharing analysis are 
guaranteed to be sufficient and thus need not be checked by a forward analysis. 

In [15], the authors of the current paper present an abstract semantics for 
backward analysis of logic programs and specialise it to infer safe modes for 
queries which ensure that the groundness assertions are not violated using the 
groundness domain Pos [20]. The backward groundness analysis is also used in
termination inference by [9]. A backward analysis using the abstract semantics
is performed by first computing an upper approximation to the success set of the
program and then a lower approximation to the set of program states
(substitutions) that will not violate any assertion. The key operation that propagates
information backwards against the control-flow is the (intuitionistic) logical
implication operation. Thus, the abstract domain of the analysis is required to condense.
This, however, is a strong requirement for any domain. The abstract domain
for the backward sharing analysis does not condense. The approach advocated
in this paper is to found backward analysis on a novel abstract semantics for
backward analysis which relaxes this requirement. An analysis using the new
abstract semantics is a greatest fixpoint computation whilst an analysis with the
abstract semantics in [15] additionally performs a least fixpoint computation.

The authors of this paper have also shown how backward analysis can be used 
to perform type inference [19]. Given type signatures for a collection of selected 
predicates such as builtin or library predicates, the analysis of [19] infers type 
signatures for other predicates such that the execution of any query satisfying 
the inferred type signatures will not violate the given type signatures. Thus, the 
backward type analysis generalises type checking in which the programmer man- 
ually specifies type signatures for all predicates that are checked for consistency 
by a type checker. The work of [19] is distinct from that reported in this paper. 
The property considered in [19] is closed under instantiation whilst that in this 
paper is not. 

Very recently, Gallagher [8] has proposed a program transformation as a
tactic for realising backward analysis in terms of forward analysis. Assertions are
realised with a meta-predicate d(G,P). The meta-predicate d(G,P) expresses
the relationship between an initial goal G and a property P to be checked at
some program point. The meta-predicate d(G,P) holds if there is a derivation
starting from G leading to the program point with which P is associated. The
transformed program defining the predicate d can be seen as a realisation of
the resultants semantics [7]. Backward analysis is performed by examining the
meaning of d, which can be approximated using a standard forward analysis, to
deduce goals G that imply that the property P holds. This work is both
promising and intriguing because it finesses the requirement of calculating a greatest
fixpoint. One interesting line of enquiry would be to compare the expressive
power of the transformation - the pre-conditions it infers - against those deduced
via a bespoke backward analysis framework [15,19].

Giacobazzi [10] proposes a method for an abductive analysis of modular logic
programs. From a specification of the success patterns of a predicate defined in 
one module which calls open predicates defined in other modules, the method 
derives a specification for the success patterns of the open predicates. In contrast, 
our method derives a specification for the call patterns of some (unspecified) 
predicates from a specification of the call patterns of other (specified) predicates. 

Pedreschi and Ruggieri [23] develop a calculus of weakest pre-conditions and 
weakest liberal pre-conditions, the latter of which is essentially a reformulation 
of Hoare’s logic. Weakest liberal pre-conditions are characterised as the greatest 
fixpoint of a co-continuous function on the space of interpretations. Our work 
takes these ideas forward to show how abstract interpretation can infer weakest 
liberal pre-conditions. 

Cousot and Cousot [3] explain how a backward collecting semantics can be 
used to precisely characterise states that arise in finite SLD-derivations. They 
present both a forward collecting semantics that records the descendant states 
that arise from a set of initial states and a backward collecting semantics that 
records those states which occur as ascendant states of the final states. By com- 
bining both semantics, they characterise the set of descendant states of the initial 
states which are also ascendant states of the final states of the transition system. 
This use of backward analysis is primarily as a device to improve the precision 
of a goal-dependent analysis. Our work is more radical in the sense that it shows
how a bottom-up analysis, performed in a backward fashion, can be used to
characterise initial queries. Moreover, it is used for lower approximation rather than
upper approximation.

Hughes and Launchbury [12] show how and when to reverse an analysis
based on abstract interpretation. Their work is concerned with analyses of
functional programs. In fact, Hughes and Launchbury argue that ideally the direction
of an analysis should be reversed without reference to the concrete semantics. 
Our work demonstrates that this can be accomplished in the analysis of logic 
programs. 

A systematic comparison of the relative precision of forward and backward 
abstract interpretation of logic programs is given in [16]. 

8 Conclusions 

A backward sharing analysis for logic programs has been presented. From a given 
post-condition for an atom, it derives a pre-condition. Any successful execution 
of the atom in any state satisfying the pre-condition ends in a state satisfying 
the post-condition. Abstract operations for the backward sharing analysis are 
constructed by inverting those for a forward sharing analysis. The work demon- 
strates that backward analysis is applicable for properties that are not closed 
under instantiation. 

Abstract. Outermost-needed rewriting/narrowing is a sound and complete
optimal demand-driven strategy for the class of inductively sequential
constructor systems. Its parallel extension, known as weakly
outermost-needed rewriting/narrowing, deals with non-inductively sequential
constructor systems. Recently, refinements of (weakly) outermost-needed
rewriting and narrowing have been obtained. These new strategies are called
natural rewriting and natural narrowing, respectively, and incorporate a
better treatment of demandedness. In this paper, we address the problem of
how to implement natural rewriting and narrowing efficiently by using a
refinement of the notion of definitional tree, which we call matching
definitional tree. We also show how to compile natural rewriting and
narrowing to Prolog and provide some promising experimental results.



1 Introduction 

A challenging problem in modern programming languages is the discovery of 
sound and complete evaluation strategies which are ‘optimal’ w.r.t. some effi- 
ciency criterion (typically the number of evaluation steps and the avoidance of 
infinite, failing or redundant derivations) and which are easily implementable. 

A sound and complete rewrite strategy for the class of inductively sequential 
constructor systems (CSs) is outermost-needed rewriting [3]. The extension to 
narrowing is called outermost-needed narrowing (or needed narrowing) [5]. Intu- 
itively, a left-linear CS is inductively sequential if there exists some branching 
selection structure that is inherent to the rules. The optimality properties as- 
sociated to inductively sequential CSs explain why outermost-needed narrowing 
has become useful in functional logic programming as the functional logic coun- 
terpart of Huet and Levy’s strongly needed reduction [13]. Weakly outermost- 
needed rewriting [4] is defined for non-inductively sequential CSs. Its extension 
to narrowing is called weakly outermost-needed narrowing [4] and is considered 
as the functional logic counterpart of Sekar and Ramakrishnan’s parallel needed 
reduction [15]. 

Whereas outermost-needed rewriting and narrowing are optimal w.r.t.
inductively sequential CSs, weakly outermost-needed rewriting and narrowing
do not work appropriately on non-inductively sequential CSs, as shown by the
following example.

* Work partially supported by MCyT under grants TIC2001-2705-C03-01,
HA2001-0059 and HU2001-0019.

Example 1. Consider Berry's program [15] where T and F are constructor
symbols and X is a variable:

B(T,F,X) = T    B(F,X,T) = T    B(X,T,F) = T

This CS is not inductively sequential since there is no branching selection
structure in the rules. However, although the CS is not inductively sequential, some
terms can still be reduced sequentially, where 'to be reduced sequentially' is
understood as the property of reducing only positions that are unavoidable (or
"needed") when attempting to obtain a normal form (see [13]). For instance, the
term B(B(T,F,T),B(F,T,T),F) has a unique 'optimal' rewrite sequence which
achieves its associated normal form T:

B(B(T,F,T),B(F,T,T),F) $\to$ B(B(T,F,T),T,F) $\to$ T

However, weakly outermost-needed rewriting is not optimal since, besides the
previous optimal sequence, the sequence

B(B(T,F,T),B(F,T,T),F) $\to$ B(T,B(F,T,T),F) $\to$ B(T,T,F) $\to$ T

is also obtained. The reason is that weakly outermost-needed rewriting
partitions the CS into the inductively sequential subsets $\mathcal{R}_1$ = {B(X,T,F) = T} and
$\mathcal{R}_2$ = {B(T,F,X) = T, B(F,X,T) = T} in such a way that the first step of the
former (optimal) rewriting sequence is obtained w.r.t. subset $\mathcal{R}_1$ whereas the
first (useless) step of the latter rewriting sequence is obtained w.r.t. subset $\mathcal{R}_2$.
Note that the problem also occurs in weakly outermost-needed narrowing. For
instance, term B(X,B(F,T,T),F) has the optimal narrowing sequence

B(X,B(F,T,T),F) $\leadsto_{id}$ B(X,T,F) $\leadsto_{id}$ T

whereas weakly outermost-needed narrowing also produces the following
non-optimal (due to the unnecessary substitution $\{X \mapsto T\}$) narrowing sequence

B(X,B(F,T,T),F) $\leadsto_{\{X \mapsto T\}}$ B(T,T,F) $\leadsto_{id}$ T
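
The rewrite sequences of Example 1 can be checked mechanically. The following is a minimal sketch, not the strategy of the paper: it simply enumerates every maximal rewrite sequence of Berry's program from the example term. The term encoding (strings for variables, tuples for applications) and the helper names `match`, `instantiate`, `successors` and `sequences` are assumptions made for illustration.

```python
# Terms: variables are strings; applications are tuples (symbol, arg1, ..., argk).
T, F = ("T",), ("F",)

# Berry's rules from Example 1 as (lhs, rhs) pairs.
RULES = [
    (("B", T, F, "X"), T),
    (("B", F, "X", T), T),
    (("B", "X", T, F), T),
]

def match(pattern, term, subst):
    """Return subst extended so that subst(pattern) == term, or None."""
    if isinstance(pattern, str):                      # pattern variable
        if pattern in subst and subst[pattern] != term:
            return None
        return {**subst, pattern: term}
    if isinstance(term, str) or pattern[0] != term[0] or len(pattern) != len(term):
        return None
    for p_arg, t_arg in zip(pattern[1:], term[1:]):
        subst = match(p_arg, t_arg, subst)
        if subst is None:
            return None
    return subst

def instantiate(subst, term):
    """Apply a substitution to a term."""
    if isinstance(term, str):
        return subst.get(term, term)
    return (term[0],) + tuple(instantiate(subst, a) for a in term[1:])

def successors(term):
    """Yield every term reachable from `term` by one rewrite step."""
    if isinstance(term, str):
        return
    for lhs, rhs in RULES:                            # redex at the root
        subst = match(lhs, term, {})
        if subst is not None:
            yield instantiate(subst, rhs)
    for i, arg in enumerate(term[1:], start=1):       # redex inside argument i
        for new_arg in successors(arg):
            yield term[:i] + (new_arg,) + term[i + 1:]

def sequences(term):
    """Return all maximal rewrite sequences starting at `term`."""
    nexts = list(successors(term))
    if not nexts:
        return [[term]]
    return [[term] + rest for n in nexts for rest in sequences(n)]

t = ("B", ("B", T, F, T), ("B", F, T, T), F)
seqs = sequences(t)
steps = sorted(len(s) - 1 for s in seqs)
print(steps)
```

Under this encoding every sequence ends in T, and exactly one of them has two steps: the optimal sequence that first reduces the second argument. The remaining sequences reduce the first argument at some point and need three steps.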

On the other hand, outermost-needed rewriting and narrowing are optimal 
for inductively sequential CSs only when non-failing input terms are considered, 
i.e. terms which can be reduced or narrowed to a constructor head-normal form. 
Hence, some refinement is still possible for failing input terms. 

Modern (multiparadigm) programming languages apply computational
strategies which are based on some notion of demandedness of a position in a term by a
rule (see [7]). Programs in these languages are commonly modeled by left-linear
CSs, and computational strategies take advantage of this constructor condition
(see [2,14]).

Example 2. Consider the following TRS borrowed from [8] defining the symbol
÷, which encodes the division function between natural numbers.

0 ÷ s(N) = 0                        M − 0 = M
s(M) ÷ s(N) = s((M−N) ÷ s(N))       s(M) − s(N) = M−N

Consider the term t = 10! ÷ 0, which is a (non-constructor) head-normal form.
Outermost-needed rewriting forces¹ the reduction of the first argument and
evaluates 10!, which is useless. The reason is that outermost-needed rewriting uses
a data structure called definitional tree which encodes the branching selection
structure existing in the rules without testing whether the rules associated to
each branch could ever be matched to the term or not. A similar problem
occurs when narrowing the term X ÷ 0, since variable X is instantiated to 0 or s.
However, neither instantiation is really necessary.

In [9], we proposed a solution to these two problems, namely the non-optimal 
evaluation for non-inductively sequential programs and the unnecessary evalu- 
ation in failing terms, which is based on a suitable extension of the demand- 
edness notion associated to weakly outermost-needed rewriting and narrowing. 
The new strategies are called natural rewriting and natural narrowing, respec- 
tively. Our strategies incorporate a better treatment of demandedness and enjoy 
good computational properties; in particular, we show how to use them for com- 
puting (head-)normal forms and we prove they are conservative w.r.t. (weakly) 
outermost-needed rewriting and (weakly) outermost-needed narrowing. More- 
over, we defined a new class of CSs called inductively sequential preserving where 
natural rewriting and narrowing preserve optimality for sequential parts of the 
program. This new class of CSs is based on the extension of the notion of induc- 
tive sequentiality from defined function symbols to terms and is larger than the 
class of inductively sequential CSs. 

In this paper, we address how to implement natural rewriting and natural 
narrowing efficiently. After some preliminaries in Section 2, in Section 3, we 
provide a generalization of the notion of definitional tree, which we call matching 
definitional tree. In Section 4, we present how to reproduce natural rewriting and 
natural narrowing by traversing matching definitional trees. In Section 5, we 
show that it is possible to implement natural rewriting and narrowing efficiently 
by compiling to Prolog. Finally, Section 6 presents our conclusions. 

2 Preliminaries 

We assume some familiarity with term rewriting (see [16] for missing definitions)
and narrowing (see [10] for missing definitions). Let $R \subseteq A \times A$ be a binary
relation on a set $A$. We denote the reflexive closure of $R$ by $R^{=}$, its transitive
closure by $R^{+}$, and its reflexive and transitive closure by $R^{*}$. An element $a \in A$
is an $R$-normal form, if there exists no $b$ such that $a\,R\,b$. We say that $b$ is an
$R$-normal form of $a$ (written $a\,R^{!}\,b$), if $b$ is an $R$-normal form and $a\,R^{*}\,b$.

Throughout the paper, $\mathcal{X}$ denotes a countable set of variables $\{x, y, \ldots\}$ and
$\mathcal{F}$ denotes a many-sorted signature, i.e. a set of function symbols $\{f, g, \ldots\}$
grouped into sorts. We denote the set of terms built from $\mathcal{F}$ and $\mathcal{X}$ by $T(\mathcal{F}, \mathcal{X})$.
A $k$-tuple $t_1, \ldots, t_k$ of terms is written $\bar{t}$. A term is said to be linear if it has
no multiple occurrences of a single variable. Let $Subst(T(\mathcal{F}, \mathcal{X}))$ denote the set
of substitutions. We denote by $id$ the "identity" substitution: $id(x) = x$ for all
$x \in \mathcal{X}$. Terms are ordered by the preorder $\le$ of "relative generality", i.e. $s \le t$
if there exists $\sigma$ s.t. $\sigma(s) = t$. Term $t$ is a variant of $s$ if $t \le s$ and $s \le t$. A
most general unifier (mgu) of $t, s$ is a unifier $\sigma$ such that for each unifier $\sigma'$ of
$t, s$ there exists $\theta$ such that $\sigma' = \theta \circ \sigma$.

¹ Note that this behavior is independent of the fact that two possible definitional trees
exist for symbol ÷ (see [9, Example 21]).

By $Pos(t)$ we denote the set of positions of a term $t$. Given a set $S \subseteq \mathcal{F} \cup \mathcal{X}$,
$Pos_S(t)$ denotes positions in $t$ where symbols in $S$ occur. We denote the root
position by $\Lambda$. Given positions $p, q$, we denote their concatenation as $p.q$. Positions
are ordered by the standard prefix ordering $\le$. The subterm at position $p$ of $t$ is
denoted as $t|_p$, and $t[s]_p$ is the term $t$ with the subterm at position $p$ replaced
by $s$. The symbol labeling the root of $t$ is denoted as $root(t)$.
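
As an illustration of this notation, the sketch below encodes positions as tuples of argument indices (the root position $\Lambda$ is the empty tuple) and implements $Pos(t)$, $t|_p$, $t[s]_p$, $root(t)$ and the prefix ordering. The tuple encoding of terms and all function names are assumptions chosen for this example, not part of the paper.

```python
# Terms: variables are strings; applications are tuples (symbol, arg1, ..., argk).

def positions(t):
    """Yield Pos(t); the root position is the empty tuple."""
    yield ()
    if not isinstance(t, str):
        for i, arg in enumerate(t[1:], start=1):
            for p in positions(arg):
                yield (i,) + p

def subterm(t, p):
    """Return t|_p."""
    for i in p:
        t = t[i]          # index 0 holds the symbol, so argument i is t[i]
    return t

def replace(t, p, s):
    """Return t[s]_p, the term t with the subterm at p replaced by s."""
    if not p:
        return s
    i = p[0]
    return t[:i] + (replace(t[i], p[1:], s),) + t[i + 1:]

def root(t):
    """Return the symbol labeling the root of t."""
    return t if isinstance(t, str) else t[0]

def is_prefix(p, q):
    """Prefix ordering on positions: p <= q."""
    return q[:len(p)] == p

T, F = ("T",), ("F",)
t = ("B", "X", ("B", F, T, T), F)     # the term B(X, B(F,T,T), F)
all_positions = list(positions(t))
```

For the term B(X,B(F,T,T),F), `subterm(t, (2,))` is the inner B(F,T,T) and `replace(t, (2,), T)` is B(X,T,F).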

A rewrite rule is an ordered pair $(l, r)$, written $l \to r$, with $l, r \in T(\mathcal{F}, \mathcal{X})$ and
$l \notin \mathcal{X}$. The left-hand side (lhs) of the rule is $l$ and the right-hand side (rhs) of
the rule is $r$. A TRS is a pair $\mathcal{R} = (\mathcal{F}, R)$ where $R$ is a set of rewrite rules. $L(\mathcal{R})$
denotes the set of lhs's of $\mathcal{R}$. A TRS $\mathcal{R}$ is left-linear if for all $l \in L(\mathcal{R})$, $l$ is a linear
term. Given $\mathcal{R} = (\mathcal{F}, R)$, we take $\mathcal{F}$ as the disjoint union $\mathcal{F} = \mathcal{C} \uplus \mathcal{D}$ of symbols
$c \in \mathcal{C}$, called constructors, and symbols $f \in \mathcal{D}$, called defined functions, where
$\mathcal{D} = \{root(l) \mid l \to r \in R\}$ and $\mathcal{C} = \mathcal{F} - \mathcal{D}$. A pattern is a term $f(l_1, \ldots, l_k)$
where $f \in \mathcal{D}$ and $l_i \in T(\mathcal{C}, \mathcal{X})$, for $1 \le i \le k$. A TRS $\mathcal{R} = (\mathcal{C} \uplus \mathcal{D}, R)$ is a
constructor system (CS) if all lhs's are patterns. A term $t \in T(\mathcal{F}, \mathcal{X})$ rewrites to
$s$ (at position $p$), written $t \stackrel{p}{\to}_{l \to r} s$ (or just $t \to s$), if $t|_p = \sigma(l)$ and $s = t[\sigma(r)]_p$,
for some rule $l \to r \in R$, $p \in Pos(t)$ and substitution $\sigma$. The subterm $\sigma(l)$ in
$t$ is called a redex. On the other hand, a term $t \in T(\mathcal{F}, \mathcal{X})$ narrows to $s$ (at
position $p$ with substitution $\sigma$), written $t \leadsto_{p,\, l \to r,\, \sigma} s$ (or just $t \leadsto_{\sigma} s$), if $p$ is a
non-variable position in $t$ and $\sigma(t) \stackrel{p}{\to}_{l \to r} s$. A term $t$ is a head-normal form (or
root-stable) if it cannot be reduced to a redex.
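
A single rewrite step as defined above can be sketched as follows: match the lhs against $t|_p$ to obtain $\sigma$, then build $t[\sigma(r)]_p$. The example uses the two subtraction rules of Example 2. The term encoding, the symbol name `minus` and the helper names (`match`, `substitute`, `rewrite_at`) are hypothetical choices for illustration.

```python
# Terms: variables are strings; applications are tuples (symbol, arg1, ..., argk).

def match(pattern, term, sigma=None):
    """Return sigma with sigma(pattern) == term, or None if no match exists."""
    sigma = {} if sigma is None else sigma
    if isinstance(pattern, str):                      # pattern variable
        if pattern in sigma and sigma[pattern] != term:
            return None
        return {**sigma, pattern: term}
    if isinstance(term, str) or pattern[0] != term[0] or len(pattern) != len(term):
        return None
    for p_arg, t_arg in zip(pattern[1:], term[1:]):
        sigma = match(p_arg, t_arg, sigma)
        if sigma is None:
            return None
    return sigma

def substitute(sigma, t):
    """Apply the substitution sigma to t."""
    if isinstance(t, str):
        return sigma.get(t, t)
    return (t[0],) + tuple(substitute(sigma, a) for a in t[1:])

def subterm(t, p):
    for i in p:
        t = t[i]
    return t

def replace(t, p, s):
    if not p:
        return s
    i = p[0]
    return t[:i] + (replace(t[i], p[1:], s),) + t[i + 1:]

def rewrite_at(t, p, rule):
    """One rewrite step at position p with rule (lhs, rhs); None if t|_p is not a redex."""
    lhs, rhs = rule
    sigma = match(lhs, subterm(t, p))
    if sigma is None:
        return None
    return replace(t, p, substitute(sigma, rhs))

Z = ("0",)

def s(t):
    return ("s", t)

MINUS_BASE = (("minus", "M", Z), "M")                               # M - 0 = M
MINUS_STEP = (("minus", s("M"), s("N")), ("minus", "M", "N"))       # s(M) - s(N) = M - N

t0 = ("minus", s(s(Z)), s(Z))              # s(s(0)) - s(0)
t1 = rewrite_at(t0, (), MINUS_STEP)        # s(0) - 0
t2 = rewrite_at(t1, (), MINUS_BASE)        # s(0)
```

Since the rules are left-linear, the consistency check on repeated pattern variables in `match` is never triggered here; it is kept only so that the helper stays correct for non-linear patterns.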

3 Matching Definitional Trees 

In order to efficiently implement natural rewriting and narrowing, we must in- 
tegrate the demandedness notion of natural rewriting and narrowing [9] into a 
statically built structure, as happens in weakly outermost-needed rewriting and 
narrowing which use definitional trees [3]. Thus, we define matching definitional 
trees. First, we recall the definition of a (generalized) definitional tree. 

Definition 1. [4] $\mathcal{T}$ is a generalized definitional tree, or gdt, with pattern $\pi$ iff
one of the following cases holds:

$\mathcal{T} = branch(\pi, o, \mathcal{T}_1, \ldots, \mathcal{T}_k)$ where $\pi$ is a pattern, $o$ is the occurrence (called
inductive) of a variable of $\pi$, the sort of $\pi|_o$ has different constructors $c_1, \ldots, c_k$
for $k > 0$, and for all $i$ in $\{1, \ldots, k\}$, $\mathcal{T}_i$ is a gdt with pattern $\pi[c_i(\bar{x})]_o$, where
$\bar{x}$ are new distinct variables.

$\mathcal{T} = leaf(\pi, l \to r)$ where $\pi$ is a pattern and $l \to r$ is a rule such that $\pi$ and $l$
are variants.

$\mathcal{T} = or(\mathcal{T}_1, \ldots, \mathcal{T}_k)$ where $k > 1$ and each $\mathcal{T}_i$ is a gdt with pattern $\pi$.

Fig. 1. The two possible definitional trees, (a) and (b), for the symbol ÷

Fig. 2. A partition of definitional trees for the symbol B



A definitional tree is a generalized definitional tree without or-nodes. A
parallel definitional tree is a generalized definitional tree where only one or-node
is allowed at the top of the tree. A defined symbol $f$ is called inductively
sequential if there exists a definitional tree $\mathcal{T}$ with pattern $f(x_1, \ldots, x_k)$ (where
$x_1, \ldots, x_k$ are different variables) whose leaves contain all and only the rules
defining $f$. In this case, we say that $\mathcal{T}$ is a definitional tree for $f$, denoted as
$\mathcal{T}_f$. A left-linear CS $\mathcal{R}$ is inductively sequential if all its defined function symbols
are inductively sequential. It is often convenient and simplifies understanding
to provide a graphical representation of definitional trees as a tree of patterns
where the inductive position in branch-nodes is surrounded by a box [3].

Example 3. The symbol ÷ in Example 2 is inductively sequential since there
exists a definitional tree for pattern M ÷ N. Figure 1 shows the two possible
definitional trees.

When a function symbol is not inductively sequential, a parallel definitional 
tree, instead of a definitional tree, is obtained. The graphical representation of 
a parallel definitional tree corresponds to a set of definitional trees. 

Example 4. The symbol B in Example 1 is non-inductively sequential, since a
rule partition that splits its rules into subsets that are inductively sequential is
necessary. Figure 2 shows a parallel definitional tree for symbol B.

Definitional trees are used by (weakly) outermost-needed rewriting and
narrowing as a finite state automaton to compute the demanded positions to be
reduced (in line with [13,15]). However, definitional trees merge two different
processes: the pattern matching process and the evaluation process through
demanded positions. These two processes coincide in the case of inductively
sequential and non-failing terms because the evaluation sequence corresponds to the
pattern matching sequence, but do not coincide for non-inductively sequential
or failing terms where the pattern matching process may determine a
(possibly better) evaluation sequence (as shown in Examples 1 and 2). We define a
matching definitional tree as a definitional tree where more than one inductive
position is allowed for branch-nodes.

152 Santiago Escobar

Fig. 3. A matching definitional tree for symbol ÷

Fig. 4. A matching definitional tree for symbol ∨



Definition 2. $\mathcal{T}$ is a matching definitional tree, or mdt, with pattern $\pi$ iff one
of the following cases holds:

$\mathcal{T} = branch(\pi, (o_1, \mathcal{T}^1_1, \ldots, \mathcal{T}^1_{k_1}), \ldots, (o_n, \mathcal{T}^n_1, \ldots, \mathcal{T}^n_{k_n}))$ where $\pi$ is a pattern,
$o_1, \ldots, o_n$ with $n > 0$ are occurrences of variables of $\pi$, the sort of $\pi|_{o_i}$
has different constructors $c^i_1, \ldots, c^i_{k_i}$ for $i$ in $\{1, \ldots, n\}$ and $k_i > 0$, and for
all $j$ in $\{1, \ldots, k_i\}$, $\mathcal{T}^i_j$ is a mdt with pattern $\pi[c^i_j(\bar{x})]_{o_i}$, where $\bar{x}$ are new
distinct variables.

$\mathcal{T} = leaf(\pi, l \to r)$ where $\pi$ is a pattern and $l \to r$ is a rule such that $l \le \pi$.

$\mathcal{T} = or(\mathcal{T}_1, \ldots, \mathcal{T}_k)$ where $k > 1$ and each $\mathcal{T}_i$ is a mdt with pattern $\pi$.

For each defined symbol $f$ in a left-linear CS, we can build a matching definitional
tree $\mathcal{T}$ with pattern $f(x_1, \ldots, x_k)$ (where $x_1, \ldots, x_k$ are different variables) whose
leaves contain all and only the rules defining $f$. The graphical representation is
similar to that of a definitional tree with the difference that it is possible to have
more than one inductive position in branch-nodes and or-nodes are identified as
nodes which do not have any boxed position. We denote branch-nodes that have
only one position $o$ as simply $branch(\pi, o, \mathcal{T}_1, \ldots, \mathcal{T}_k)$ (as happens in gdt's).

Fig. 5. The definitional tree and matching definitional tree for symbol ≤



Example 5. Figure 3 represents the following² matching definitional tree which
could be associated to the symbol ÷ in Example 2:

branch(M ÷ N, (1, branch(0 ÷ N, 2, leaf(0 ÷ s(N'))),
                  branch(s(M') ÷ N, 2, leaf(s(M') ÷ s(N')))),
              (2, branch(M ÷ s(N'), 1, leaf(0 ÷ s(N')),
                                       leaf(s(M') ÷ s(N')))))



Example 6. Consider the following parallel-or TRS from [9]:

True ∨ X = True    X ∨ True = True    False ∨ X = X

Figure 4 denotes the following mdt associated to the symbol ∨:

branch(X ∨ Y, (1, or(leaf(True ∨ Y),
                     branch(True ∨ Y, 2, leaf(True ∨ True))),
                  or(leaf(False ∨ Y),
                     branch(False ∨ Y, 2, leaf(False ∨ True)))),
              (2, or(leaf(X ∨ True),
                     branch(X ∨ True, 1, leaf(True ∨ True),
                                         leaf(False ∨ True))),
                  branch(X ∨ False, 1, leaf(True ∨ False),
                                       leaf(False ∨ False))))

Note that every definitional tree is also a matching definitional tree.

Example 7. Consider the following defined symbol, which is an example
commonly used to refer to definitional trees [5]:

0 ≤ N = True    s(M) ≤ 0 = False    s(M) ≤ s(N) = M ≤ N

Since this symbol has only one possible definitional tree (unlike symbol ÷
of Example 2, which has two), the associated definitional tree of Figure 5
corresponds to its matching definitional tree:

branch(M ≤ N, 1, leaf(0 ≤ N),
              branch(s(M') ≤ N, 2, leaf(s(M') ≤ 0), leaf(s(M') ≤ s(N'))))
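
To make Definition 2 concrete, the following sketch encodes matching definitional trees as a small data structure and rebuilds the trees of Examples 5 and 7. It is only an illustration: the class names `Leaf`, `Or` and `Branch`, the helper `branch`, and the symbol names `div` and `leq` are assumptions, and leaves store only their pattern (the rule is omitted, as in the examples).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Leaf:
    pattern: tuple

@dataclass(frozen=True)
class Or:
    pattern: tuple
    children: tuple

@dataclass(frozen=True)
class Branch:
    pattern: tuple
    groups: tuple          # ((inductive position, (subtree, ...)), ...)

def branch(pattern, *groups):
    """Build a branch-node from (position, [subtrees]) groups."""
    return Branch(pattern, tuple((o, tuple(ts)) for o, ts in groups))

def leaves(tree):
    """Patterns of all leaves, from left to right."""
    if isinstance(tree, Leaf):
        return [tree.pattern]
    if isinstance(tree, Or):
        return [p for c in tree.children for p in leaves(c)]
    return [p for _, ts in tree.groups for c in ts for p in leaves(c)]

def is_definitional_tree(tree):
    """True if the mdt has no or-node and one inductive position per branch."""
    if isinstance(tree, Leaf):
        return True
    if isinstance(tree, Or):
        return False
    return len(tree.groups) == 1 and all(
        is_definitional_tree(c) for _, ts in tree.groups for c in ts)

Z = ("0",)

def s(t):
    return ("s", t)

def div(a, b):
    return ("div", a, b)

def leq(a, b):
    return ("leq", a, b)

# Example 5: two inductive positions at the root.
DIV_TREE = branch(div("M", "N"),
    (1, [branch(div(Z, "N"), (2, [Leaf(div(Z, s("N'")))])),
         branch(div(s("M'"), "N"), (2, [Leaf(div(s("M'"), s("N'")))]))]),
    (2, [branch(div("M", s("N'")), (1, [Leaf(div(Z, s("N'"))),
                                        Leaf(div(s("M'"), s("N'")))]))]))

# Example 7: an ordinary definitional tree, hence also an mdt.
LEQ_TREE = branch(leq("M", "N"),
    (1, [Leaf(leq(Z, "N")),
         branch(leq(s("M'"), "N"), (2, [Leaf(leq(s("M'"), Z)),
                                        Leaf(leq(s("M'"), s("N'")))]))]))
```

In this encoding the tree for ÷ has four leaves but only two distinct leaf patterns, one per rule of ÷, because each rule is reachable through either inductive position of the root.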



² In this paper, we often omit the rule in a leaf node for simplicity.

4 Evaluation through Matching Definitional Trees



In inductively sequential left-linear CSs, the order of evaluation is determined
by a mapping $\varphi$ which implements the strategy by traversing definitional trees
as a finite state automaton, in order to compute the demanded positions to be
reduced [3].

In order to implement natural rewriting efficiently, we define a mapping mt 
which traverses a matching definitional tree as a pattern matching finite state 
automaton and returns the set of demanded positions associated to a concrete 
branch-node if all its inductive positions are rooted by non-constructor symbols. 



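Definition 3. The function mt takes two arguments: an operation-rooted term,
$t$, and a mdt, $\mathcal{T}$, such that $pattern(\mathcal{T})$ and $t$ unify. The function mt yields a set
of tuples of the form $(p, R)$, where $p$ is a position of $t$, and $R$ is a rule $l \to r$ of
$\mathcal{R}$. The function mt is defined as follows:

$$mt(t, \mathcal{T}) \ni \begin{cases}
(\Lambda, R) & \text{if } \mathcal{T} = leaf(\pi, R); \\
(p, R) & \text{if } \mathcal{T} = or(\mathcal{T}_1, \ldots, \mathcal{T}_k) \text{ and } (p, R) \in mt(t, \mathcal{T}_i) \text{ for some } i; \\
(p, R) & \text{if } \mathcal{T} = branch(\pi, (o_1, \mathcal{T}^1_1, \ldots, \mathcal{T}^1_{k_1}), \ldots, (o_n, \mathcal{T}^n_1, \ldots, \mathcal{T}^n_{k_n})), \\
 & root(t|_{o_i}) \in \mathcal{C} \text{ for some } 1 \le i \le n,\ pattern(\mathcal{T}^i_j) \le t \\
 & \text{for some } 1 \le j \le k_i, \text{ and } (p, R) \in mt(t, \mathcal{T}^i_j); \\
(o_i.p, R) & \text{if } \mathcal{T} = branch(\pi, (o_1, \mathcal{T}^1_1, \ldots, \mathcal{T}^1_{k_1}), \ldots, (o_n, \mathcal{T}^n_1, \ldots, \mathcal{T}^n_{k_n})), \\
 & root(t|_{o_i}) \in \mathcal{D} \text{ for all } 1 \le i \le n, \\
 & \text{and } (p, R) \in mt(t|_{o_i}, \mathcal{T}_{root(t|_{o_i})}) \text{ for some } 1 \le i \le n.
\end{cases}$$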



Example 8. Consider the TRS and term t = 10! ÷ 0 in Example 2 together
with the matching definitional tree $\mathcal{T}_{\div}$ in Example 5 (and Figure 3). We have
$mt(t, \mathcal{T}_{\div}) = \emptyset$ since the two inductive positions of the topmost branch-node
are not both operation-rooted and no subtree can be selected: the first two subtrees
because position 1 is operation-rooted and the third one because it needs an s
constructor symbol at position 2, instead of 0.
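
The traversal of Definition 3 can be prototyped directly. The sketch below is an assumption-laden illustration rather than the implementation of the paper: terms are tuples, leaves carry only their pattern in place of the rule, `trees` maps each defined symbol to its mdt, and the term 10! of Example 8 is stood in for by a hypothetical defined symbol `fact`.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Leaf:
    pattern: tuple

@dataclass(frozen=True)
class Or:
    pattern: tuple
    children: tuple

@dataclass(frozen=True)
class Branch:
    pattern: tuple
    groups: tuple            # ((position, (subtree, ...)), ...)

def root(t):
    return t if isinstance(t, str) else t[0]

def subterm(t, p):
    for i in p:
        t = t[i]
    return t

def matches(pattern, t):
    """pattern <= t for a linear pattern: t is an instance of pattern."""
    if isinstance(pattern, str):
        return True
    if isinstance(t, str) or pattern[0] != t[0] or len(pattern) != len(t):
        return False
    return all(matches(p, a) for p, a in zip(pattern[1:], t[1:]))

def mt(t, tree, trees, constructors):
    """Set of (position, leaf pattern) pairs selected for t by the mdt `tree`."""
    if isinstance(tree, Leaf):                                  # leaf case
        return {((), tree.pattern)}
    if isinstance(tree, Or):                                    # or case
        return set().union(*(mt(t, c, trees, constructors) for c in tree.children))
    result = set()
    roots = [root(subterm(t, o)) for o, _ in tree.groups]
    for (o, subtrees), r in zip(tree.groups, roots):
        if r in constructors:                                   # constructor at o: keep matching
            for sub in subtrees:
                if matches(sub.pattern, t):
                    result |= mt(t, sub, trees, constructors)
    if all(r in trees for r in roots):                          # all positions operation-rooted
        for (o, _), r in zip(tree.groups, roots):
            for p, rule in mt(subterm(t, o), trees[r], trees, constructors):
                result.add((o + p, rule))
    return result

Z = ("0",)

def s(t):
    return ("s", t)

def div(a, b):
    return ("div", a, b)

L0 = Leaf(div(Z, s("N'")))               # leaf for 0 ÷ s(N) = 0
LS = Leaf(div(s("M'"), s("N'")))         # leaf for s(M) ÷ s(N) = s((M−N) ÷ s(N))
DIV_TREE = Branch(div("M", "N"), (
    ((1,), (Branch(div(Z, "N"), (((2,), (L0,)),)),
            Branch(div(s("M'"), "N"), (((2,), (LS,)),)))),
    ((2,), (Branch(div("M", s("N'")), (((1,), (L0, LS)),)),)),
))
TREES = {"div": DIV_TREE}
CONS = {"0", "s"}

failing = div(("fact", s(Z)), Z)         # stands in for 10! ÷ 0
redex = div(Z, s(Z))                     # 0 ÷ s(0)
nested = div(div(Z, s(Z)), s(Z))         # (0 ÷ s(0)) ÷ s(0)
```

With these inputs, `mt(failing, ...)` is empty, as in Example 8; `mt(redex, ...)` selects the root; and `mt(nested, ...)` selects position 1, because the second argument already matches s(N') and only the first argument remains to be evaluated.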



Outermost-needed narrowing is determined by a mapping $\lambda$ which
implements the strategy by traversing definitional trees [5]. Here, we define a mapping
mnt which extends mt to narrowing as $\lambda$ extends $\varphi$.



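Definition 4. The function mnt takes two arguments: an operation-rooted term,
$t$, and a mdt, $\mathcal{T}$, such that $pattern(\mathcal{T})$ and $t$ unify. The function mnt yields a
set of triples of the form $(p, R, \sigma)$, where $p$ is a position of $t$, $R$ is a rule $l \to r$
of $\mathcal{R}$, and $\sigma$ is a unifier of $pattern(\mathcal{T})$ and $t$. The function mnt is defined as
follows:

$$mnt(t, \mathcal{T}) \ni \begin{cases}
(\Lambda, R, \sigma) & \text{if } \mathcal{T} = leaf(\pi, R) \text{ and } \sigma = mgu(\pi, t); \\
(p, R, \sigma) & \text{if } \mathcal{T} = or(\mathcal{T}_1, \ldots, \mathcal{T}_k) \text{ and } (p, R, \sigma) \in mnt(t, \mathcal{T}_i) \text{ for some } i; \\
(p, R, \sigma) & \text{if } \mathcal{T} = branch(\pi, (o_1, \mathcal{T}^1_1, \ldots, \mathcal{T}^1_{k_1}), \ldots, (o_n, \mathcal{T}^n_1, \ldots, \mathcal{T}^n_{k_n})), \\
 & root(t|_{o_i}) \in \mathcal{C} \text{ for some } 1 \le i \le n,\ pattern(\mathcal{T}^i_j) \le t \\
 & \text{for some } 1 \le j \le k_i, \text{ and } (p, R, \sigma) \in mnt(t, \mathcal{T}^i_j); \\
(o_i.p, R, \sigma) & \text{if } \mathcal{T} = branch(\pi, (o_1, \mathcal{T}^1_1, \ldots, \mathcal{T}^1_{k_1}), \ldots, (o_n, \mathcal{T}^n_1, \ldots, \mathcal{T}^n_{k_n})), \\
 & root(t|_{o_i}) \notin \mathcal{C} \text{ for all } 1 \le i \le n,\ root(t|_{o_i}) \in \mathcal{D} \text{ for} \\
 & \text{some } 1 \le i \le n, \text{ and } (p, R, \sigma) \in mnt(t|_{o_i}, \mathcal{T}_{root(t|_{o_i})}); \\
(p, R, \theta \circ \sigma) & \text{if } \mathcal{T} = branch(\pi, (o_1, \mathcal{T}^1_1, \ldots, \mathcal{T}^1_{k_1}), \ldots, (o_n, \mathcal{T}^n_1, \ldots, \mathcal{T}^n_{k_n})), \\
 & root(t|_{o_i}) \notin \mathcal{C} \text{ for all } 1 \le i \le n,\ t|_{o_i} \in \mathcal{X} \text{ for some} \\
 & 1 \le i \le n,\ \theta = mgu(t, pattern(\mathcal{T}^i_j)) \text{ for some} \\
 & 1 \le j \le k_i, \text{ and } (p, R, \sigma) \in mnt(\theta(t), \mathcal{T}^i_j).
\end{cases}$$

Fig. 6. A matching definitional tree for symbol B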



Example 9. Consider the TRS and term t = B(X,B(F,T,T),F) in Example 1.
The mdt $\mathcal{T}_B$ is depicted in Figure 6. Then, although the rule B(F,X,T)=T
appears four times in $\mathcal{T}_B$, we have $mnt(t, \mathcal{T}_B)$ = {(2, B(F,X,T)=T, id)} since only the
rightmost branch associated to inductive position 3 and constructor symbol F at
the third argument is selected.

It is worth noting that Definitions 3 and 4 boil down to outermost-needed 
rewriting [3] and outermost-needed narrowing [5], respectively, when definitional 
trees are considered (i.e. matching definitional trees where each branch-node has 
only one inductive position and no or-node is included) . 

In the following, we prove that natural rewriting and natural narrowing can 
be appropriately reproduced by using Definitions 3 and 4 and matching defini- 
tional trees. 

4.1 Natural Rewriting 

In [9], we provided a suitable refinement of the demandedness notion associated 
to outermost-needed rewriting which uses some basic definitions from the on- 
demand evaluation presented in [1], dealing with syntactic strategy annotations. 

Definition 5. [1] Given $t, l \in T(\mathcal{F}, \mathcal{X})$, we let
$Pos_{\neq}(t, l) = minimal_{\le}(\{p \in Pos(t) \cap Pos_{\mathcal{F}}(l) \mid root(l|_p) \neq root(t|_p)\})$.

Definition 6. [1] We define the set of demanded positions of term $t \in T(\mathcal{F}, \mathcal{X})$
w.r.t. $l \in T(\mathcal{F}, \mathcal{X})$ (a lhs of a rule defining $root(t)$), i.e. the set of (positions of)
maximal disagreeing subterms as:

$$DP_l(t) = \begin{cases} Pos_{\neq}(t, l) & \text{if } Pos_{\neq}(t, l) \cap Pos_{\mathcal{C}}(t) = \emptyset \\ \emptyset & \text{otherwise} \end{cases}$$

Example 10. Consider again Example 1 and terms t = B(B(T,F,T),B(F,T,T),F)
and t' = B(X,B(F,T,T),F). We have (identically for t'): $DP_{B(T,F,X)}(t) = \{1, 2\}$,
$DP_{B(F,X,T)}(t) = \emptyset$, and $DP_{B(X,T,F)}(t) = \{2\}$.

Note that the restriction of disagreeing positions to positions with non-constructor
symbols (defined symbols in the case of rewriting) disables the evaluation of
subterms which could never help to produce a redex (see [1,2,14]).

In [7], Antoy and Lucas showed that modern functional (logic) languages
include evaluation strategies that consider elaborated definitions of
'demandedness' which are, in fact, related. They introduced the informal idea that some
rewriting strategies select 'popular' demanded positions over the available set.
In [9], we formalized a notion of popularity of demanded positions which makes
the difference by counting the number of times a position is demanded by some
rule, and then selects the most (frequently) demanded positions covering all
eventually applicable rules.

Definition 7. [9] We define the multiset of demanded positions of a term
$t \in T(\mathcal{F}, \mathcal{X})$ w.r.t. TRS $\mathcal{R}$ as $DP_{\mathcal{R}}(t) = \oplus\{DP_l(t) \mid l \to r \in \mathcal{R} \wedge root(t) = root(l)\}$
where $M_1 \oplus M_2$ is the union of multisets $M_1$ and $M_2$.

Example 11. Continuing Example 10, we have $DP_{\mathcal{R}}(t) = DP_{\mathcal{R}}(t') = \{1, 2, 2\}$.

Definition 8 (Demanded positions). [9] We define the set of demanded
positions of a term $t \in T(\mathcal{F}, \mathcal{X})$ w.r.t. TRS $\mathcal{R}$ as

$$DP^m_{\mathcal{R}}(t) = \{p \in DP_{\mathcal{R}}(t) \mid \exists l \in L(\mathcal{R}).\ p \in DP_l(t) \text{ and } \forall q \in DP_{\mathcal{R}}(t) : p <_{DP_{\mathcal{R}}(t)} q \Rightarrow q \notin DP_l(t)\}$$

where $x <_M y$ denotes that the number of occurrences of $x$ in the multiset $M$ is
less than the number of occurrences of $y$.

Example 12. Continuing Example 11, we have $DP^m_{\mathcal{R}}(t) = DP^m_{\mathcal{R}}(t') = \{2\}$ since
even though position 1 is demanded, the redex at position 2 is the most frequently
demanded.

Example 13. Consider the TRS TZ of Example 6 and the term 
t = (True V False) V (True V False). We have DPff{f) = {1,2} since po- 
sition 1 alone does not cover the rule X V True = True. 
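The chain from disagreeing positions to the most demanded positions can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation of [9]: terms are encoded as nested tuples, variables as plain strings, T and F are assumed to be the constructors of Example 1, and the helper names (`disagreeing`, `demanded`, `most_demanded`) are hypothetical.

```python
from collections import Counter

# Assumption: T and F are the constructor symbols, B is the defined symbol.
CONSTRUCTORS = {"T", "F"}

def is_var(t):
    return isinstance(t, str)            # variables are plain strings

def root(t):
    return t if is_var(t) else t[0]

def subterm(t, pos):
    for i in pos:
        t = t[i]                         # argument i (1-based) is tuple index i
    return t

def positions(t):
    """All positions of t, as tuples of 1-based argument indices."""
    yield ()
    if not is_var(t):
        for i in range(1, len(t)):
            for p in positions(t[i]):
                yield (i,) + p

def disagreeing(t, l):
    """Minimal positions where t and the non-variable part of l differ."""
    pos_t = set(positions(t))
    cand = [p for p in positions(l)
            if p in pos_t and not is_var(subterm(l, p))
            and root(subterm(l, p)) != root(subterm(t, p))]
    return {p for p in cand
            if not any(q != p and p[:len(q)] == q for q in cand)}

def demanded(t, l):
    """DP_l(t): empty if some disagreeing position of t is constructor-rooted."""
    dis = disagreeing(t, l)
    if any(root(subterm(t, p)) in CONSTRUCTORS for p in dis):
        return set()
    return dis

def most_demanded(t, lhss):
    """Return the multiset of demanded positions and its most demanded part."""
    per_rule = [demanded(t, l) for l in lhss if root(l) == root(t)]
    multiset = Counter(p for dp in per_rule for p in dp)
    # p is kept if some rule demands p and demands nothing strictly more popular
    selected = {p for dp in per_rule for p in dp
                if all(multiset[q] <= multiset[p] for q in dp)}
    return multiset, selected

T, F = ("T",), ("F",)
LHSS = [("B", T, F, "X"), ("B", F, "X", T), ("B", "X", T, F)]
t = ("B", ("B", T, F, T), ("B", F, T, T), F)
multiset, selected = most_demanded(t, LHSS)
```

On the term t of Example 10, the three left-hand sides yield the position sets {1,2}, the empty set and {2}, so position 2 occurs twice in the multiset and is the only one selected, in line with Examples 11 and 12.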
The following definition establishes the strategy used in natural rewriting for
selecting demanded positions.

Definition 9 (Natural rewriting). [9] Given a term $t \in \mathcal{T}(\mathcal{F},\mathcal{X})$ and a TRS
$\mathcal{R}$, m(t) is defined as the smallest set satisfying

$$m(t) \ni \begin{cases} (\Lambda, l \to r) & \text{if } l \to r \in \mathcal{R} \text{ and } l \leq t \\ (p.q, l \to r) & \text{if } p \in DP^{m}_{\mathcal{R}}(t) \text{ and } (q, l \to r) \in m(t|_p) \end{cases}$$

We consider m simply as a function returning positions if the rules selected
for reduction are not relevant or they are clear from the context. We say term
t reduces by natural rewriting to term s, denoted by $t \xrightarrow{m}_{\langle p,\, l \to r \rangle} s$ (or simply
$t \xrightarrow{m} s$) if $(p, l \to r) \in m(t)$.

Example 14. Continuing Example 12, we have m(t) = {2} and the only possible
natural rewriting step is: B(B(T,F,T),B(F,T,T),F) $\xrightarrow{m}$ B(B(T,F,T),T,F)

Example 15. Continuing Example 13, it is clear that m(t) = {1,2} and, thus,
there exist two possible natural rewriting steps:

(True ∨ False) ∨ (True ∨ False) $\xrightarrow{m}$ True ∨ (True ∨ False)

(True ∨ False) ∨ (True ∨ False) $\xrightarrow{m}$ (True ∨ False) ∨ True

Now, we prove that natural rewriting is appropriately reproduced by using
matching definitional trees and Definition 3. First, we give a class of matching
definitional trees, called safe matching definitional trees, where the inductive
positions of a branch-node correspond to the demanded positions obtained by
the operator $DP^{m}_{\mathcal{R}}$. In the following, we denote the set of rules appearing at
leaves of a tree $\mathcal{T}$ as $leaves(\mathcal{T})$. Similarly, given a TRS $\mathcal{R}$, we denote the set of
rules which are instantiations of a pattern $\pi$ as $rules_{\mathcal{R}}(\pi) = \{l \to r \in \mathcal{R} \mid \pi \leq l\}$.
A mdt $\mathcal{T}$ is called total if it contains all and only the rules eventually applicable
to the pattern of $\mathcal{T}$ (i.e. $rules_{\mathcal{R}}(pattern(\mathcal{T})) = leaves(\mathcal{T})$).



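Definition 10. Let $\mathcal{R} = (\mathcal{F}, R)$ be a left-linear CS and $\mathcal{P}$ be a matching
definitional tree for symbol $f \in \mathcal{D}$ with pattern $\pi$. We say $\mathcal{P}$ is a safe matching
definitional tree for symbol f iff one of the following cases holds:

$\mathcal{P} = branch(\pi, (o_1, \mathcal{P}^1_1, \ldots, \mathcal{P}^1_{k_1}), \ldots, (o_n, \mathcal{P}^n_1, \ldots, \mathcal{P}^n_{k_n}))$, $DP^{m}_{\mathcal{R}}(\pi) = \{o_1, \ldots, o_n\}$,
and $\mathcal{P}^1_1, \ldots, \mathcal{P}^1_{k_1}, \ldots, \mathcal{P}^n_1, \ldots, \mathcal{P}^n_{k_n}$ are total, safe mdt's.

$\mathcal{P} = leaf(\pi, l \to r)$ and $l \to r \in \mathcal{R}$.

$\mathcal{P} = or(\mathcal{P}_1, \ldots, \mathcal{P}_k)$, $\mathcal{P}_1, \ldots, \mathcal{P}_k$ are safe mdt's, $\mathcal{P}_1, \ldots, \mathcal{P}_{k-1}$ are leaves with
pattern $\pi$, and each rule $l \to r \in leaves(\mathcal{P})$ is unique among $\mathcal{P}_1, \ldots, \mathcal{P}_k$.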



Note that the matching definitional trees of Figures 3, 4, 5 and 6 are safe mdt's
whereas the two definitional trees of Figure 1 are not. Indeed, it is worth noting
that the construction of safe mdt's is easily achievable by using the function
$DP^{m}_{\mathcal{R}}$ for branch-nodes and creating or-nodes when a pattern is a redex and also
has demanded positions. Furthermore, there is only one safe mdt per function
symbol, due to the use of $DP^{m}_{\mathcal{R}}$. Now, we prove that both definitions of natural
rewriting coincide for safe mdt's.

Theorem 1. Let $\mathcal{R} = (\mathcal{F}, R)$ be a left-linear CS, $f \in \mathcal{D}$, $t \in \mathcal{T}(\mathcal{F},\mathcal{X})$ s.t.
root(t) = f, and $\mathcal{T}_f$ be a safe mdt for f. Then, $mt(t, \mathcal{T}_f) = m(t)$.

4.2 Natural Narrowing 

The following definition establishes the strategy used in natural narrowing for 
allowing narrowing steps. 

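Definition 11 (Natural narrowing). [9] Given a term $t \in \mathcal{T}(\mathcal{F},\mathcal{X})$ and a
TRS $\mathcal{R}$, mn(t) is defined¹ as the smallest set satisfying

$$mn(t) \ni \begin{cases} (\Lambda, id, l \to r) & \text{if } l \to r \in \mathcal{R} \text{ and } l \leq t \\ (p.q, \theta, l \to r) & \text{if } p \in DP^{m}_{\mathcal{R}}(t) \cap Pos_{\mathcal{D}}(t) \text{ and } (q, \theta, l \to r) \in mn(t|_p) \\ (p, \theta \circ \sigma, l \to r) & \text{if } p \in DP^{m}_{\mathcal{R}}(t),\ t|_p = x \in \mathcal{X},\ c \text{ is in the sort of } x,\ \sigma(x) = c(\overline{w}), \\ & \overline{w} \text{ are fresh variables, and } (p, \theta, l \to r) \in mn(\sigma(t)) \end{cases}$$

We say term t narrows by natural narrowing to term s at position p using
substitution $\sigma$, denoted by $t \leadsto^{mn}_{\langle p,\, \sigma,\, l \to r \rangle} s$ (or simply $t \leadsto^{mn} s$), if
$(p, \sigma, l \to r) \in mn(t)$.

Example 16. Continuing Example 12, we have mn(t') = {2} and, thus, the natural
narrowing sequence B(X,B(F,T,T),F) $\leadsto^{mn}$ B(X,T,F) $\leadsto^{mn}$ T

Similarly to the rewriting case, we prove that both definitions of natural
narrowing coincide for safe mdt's.

Theorem 2. Let $\mathcal{R} = (\mathcal{F}, R)$ be a left-linear CS, $f \in \mathcal{D}$, $t \in \mathcal{T}(\mathcal{F},\mathcal{X})$ s.t.
root(t) = f, and $\mathcal{T}_f$ be a safe mdt for f. Then, $mnt(t, \mathcal{T}_f) = mn(t)$.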

5 Implementation 

In this section, we show that natural rewriting and natural narrowing can be im- 
plemented efficiently by reusing modern state-of-the-art compilation techniques 
developed for (weakly) outermost-needed rewriting and narrowing. Curry [12] 
is a multiparadigm programming language based on weakly outermost-needed 
narrowing which combines purely functional programming, purely logic program- 
ming, and concurrent (logic) programming in a seamless way. The Curry2Prolog 
compiler [6] included into the Curry development system PAKCS [11] is one of the 
fastest Curry implementations available nowadays. The following example helps 
to understand how the compilation of Curry programs to Prolog [6] works. 

¹ This definition is a simplification of that of [9] and differs only in the selection of
demanded positions rooted by defined symbols. Note that both definitions satisfy 
the same properties described in [9]. 
Example 17. Consider the TRS of Example 2. Basically, the technique in [6]
translates a function symbol and its associated definitional tree to case
expressions and, then, compiles these case expressions to Prolog. For instance,
Curry2Prolog uses the definitional tree (a) of Figure 1 and translates it to the
following (auxiliary) case expressions, where the constructor symbol 0 is
represented by the constructor symbol z:

M ÷ N = case M of z     -> (case N of s(N') -> z)
                  s(M') -> (case N of s(N') -> s((M' − N') ÷ s(N')))

Its translation to Prolog according to [6] is as follows:

block ÷(?,?,?,-,?).
÷(M,N,Result,Ein,Eout) :- hnf(M,HM,Ein,E1), ÷_1(HM,N,Result,E1,Eout).

block ÷_1(?,?,?,-,?).
÷_1(z,N,Result,Ein,Eout) :- hnf(N,HN,Ein,E1), ÷_1_z_2(HN,Result,E1,Eout).
÷_1(s(M),N,Res,Ein,Eout) :- hnf(N,HN,Ein,E1), ÷_1_s_2(HN,M,Res,E1,Eout).

block ÷_1_z_2(?,?,-,?).
÷_1_z_2(s(N),z,E,E).

block ÷_1_s_2(?,?,?,-,?).
÷_1_s_2(s(N),M,s(÷(-(M,share(N,EN,RN)),s(share(N,EN,RN)))),E,E).

As is usual in the transformation of functions to predicates, the result of the 
function call is included as an extra argument called Result. Since the compu- 
tational model of Curry allows concurrent executions of calls, the Curry2Prolog 
compiler introduces block Prolog declarations and arguments Ein and Eout for 
each function symbol in order to control whether a function call is suspended or 
not (see [6]). Moreover, a call hnf(T,HT,Ein,Eout) is responsible for obtaining
the head normal form HT of a term T. Finally, Curry2Prolog implements the
sharing of variables using an extra symbol share. A term share(T,ET,RT) contains
the shared term T, ET (which indicates whether T has already been evaluated), 
and RT (which is the result of T). 

In this section, we argue that it is possible to use the technique in [6] to trans- 
late matching definitional trees and natural rewriting and narrowing strategies 
to Prolog. The main difference between a gdt and a mdt is that the latter has 
more than one inductive position. Hence, we can use the previous technique for 
translating each inductive position and its set of subtrees to Prolog and include 
some extra rules when more than one inductive position exists. These extra rules
check sequentially whether each inductive position is constructor-rooted or not 
and only start an evaluation if none of the inductive positions are constructor- 
rooted (according to Definition 4). These rules use a predicate checkC that 
succeeds when its argument is rooted by a constructor symbol. We show how 
the translation works in the following example. 

Table 1. Runtimes (in ms.) of different calls within Prolog compiled programs

Benchmark   Goal         PAKCS   PAKCS with Natural Narrowing
÷           10! ÷ 10     26632   27345
÷           ⊥ ÷ 0        ∞       0
−           10! − 10!    31360   31212

Example 18. Consider the TRS in Example 2 and the safe mdt in Example 5
(and Figure 3). We can apply the technique in Example 17 to each part of the root
branch-node driven by an inductive position, and encode the pattern matching
process into an appropriate set of rules (using predicate hnf). Then we add the
necessary rules (using predicate checkC) to provide the behavior associated to
the natural narrowing strategy. Note that no extra rules (using checkC) are
necessary when there exists only one inductive position in a branch-node. The
compilation to Prolog yields:

checkC(A) :- var(A), !, fail.
checkC(z).
checkC(s(_)).

block ÷(?,?,?,-,?).
÷(M,N,Result,Ein,Eout) :- checkC(M), !, ÷_1(M,N,Result,Ein,Eout).
÷(M,N,Result,Ein,Eout) :- checkC(N), !, ÷_2(N,M,Result,Ein,Eout).
÷(M,N,Result,Ein,Eout) :- hnf(M,HM,Ein,E1), ÷_1(HM,N,Result,E1,Eout).

block ÷_1(?,?,?,-,?).
÷_1(z,N,Result,Ein,Eout) :- hnf(N,HN,Ein,E1), ÷_1_z_2(HN,Result,E1,Eout).
÷_1(s(M),N,Res,Ein,Eout) :- hnf(N,HN,Ein,E1), ÷_1_s_2(HN,M,Res,E1,Eout).

block ÷_1_z_2(?,?,-,?).
÷_1_z_2(s(N),z,E,E).

block ÷_1_s_2(?,?,?,-,?).
÷_1_s_2(s(N),M,s(÷(-(M,share(N,EN,RN)),s(share(N,EN,RN)))),E,E).

block ÷_2(?,?,?,-,?).
÷_2(s(N),M,Res,Ein,Eout) :- hnf(M,HM,Ein,E1), ÷_2_s_1(HM,N,Res,E1,Eout).

block ÷_2_s_1(?,?,?,-,?).
÷_2_s_1(z,N,z,E,E).
÷_2_s_1(s(M),N,s(÷(-(M,share(N,EN,RN)),s(share(N,EN,RN)))),E,E).

Here, the predicate checkC is used to perform the constructor-rooted test on
all the inductive positions. Note that we do not allow backtracking (using the
Prolog cut !) in the selection of an inductive position and a subtree. Note also
that a simple optimization is directly performed in the previous example: "when
two inductive positions are demanded by the same set of rules, it is sufficient to
evaluate only one of them." Thus, only position 1 is evaluated when both 1 and
2 are demanded (see the third clause of predicate ÷).
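The order in which the three clauses of the root predicate are tried can be mimicked outside Prolog. The following Python sketch only illustrates that dispatch; the term encoding and the names `check_c`, `select_position` and `BOT` are hypothetical, and it follows the optimization of the example in always evaluating the first inductive position when none is constructor-rooted.

```python
# Assumption: z and s are the constructors; z stands for 0 as in the examples.
CONSTRUCTORS = {"z", "s"}

def check_c(term):
    """Mirror of checkC: true only for constructor-rooted terms.

    Terms are tuples (symbol, arg, ...); unbound variables are strings,
    so they fail the test just as in the first checkC clause.
    """
    return isinstance(term, tuple) and term[0] in CONSTRUCTORS

def select_position(args, inductive_positions):
    """Return (position, needs_evaluation) for a branch-node.

    The first constructor-rooted inductive position is used as it is;
    if there is none, the first inductive position has to be evaluated
    to head normal form before pattern matching can continue.
    """
    for p in inductive_positions:
        if check_c(args[p - 1]):
            return p, False
    return inductive_positions[0], True

BOT = ("bot",)   # hypothetical stand-in for a non-terminating call

fast = select_position([BOT, ("z",)], [1, 2])          # the goal "bot / 0"
plain = select_position([("s", ("z",)), BOT], [1, 2])
stuck = select_position(["M", BOT], [1, 2])
```

For a goal whose first argument is a non-terminating call and whose second argument is z, the second inductive position is selected without evaluating anything, which is the behaviour behind the finite runtime reported for that goal in Table 1.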

Finally, Table 1 shows that the compiled Prolog code of the natural narrowing
strategy using matching definitional trees does not introduce any appreciable
overhead while providing better computational properties. It shows the average
in milliseconds of 10 executions measured on a Pentium III machine running
Red Hat 7.2. The Prolog code for symbol ÷ in the third column corresponds to
the Prolog code in Example 17, whereas the Prolog code for the fourth column
corresponds to the Prolog code in Example 18. However, the Prolog code for
symbol − is the same for the third and fourth columns since its mdt is translated
to the same Prolog code as the code produced by [6]. Thus, the execution times
for symbol − are similar. Note that the non-terminating symbol ⊥ is defined by
the rule ⊥ = ⊥, and the mark ∞ represents an infinite evaluation sequence.

In future work, we plan to perform more exhaustive experiments. 

6 Conclusions 

We have provided an extension of the notion of definitional tree, called matching 
definitional tree, and a reformulation of natural rewriting and narrowing using 
these matching definitional trees. We have proved that both reformulations are 
correct w.r.t. their original definitions. Moreover, we have shown that it is
possible to implement natural rewriting and narrowing efficiently by compiling 
to Prolog. Hence, it seems possible to include the natural rewriting and narrowing 
strategies into current existing implementations without great effort. This could 
encourage programmers to write non-inductively sequential programs, whose 
sequential parts could still be executed in an optimal way. 

Acknowledgements. I am grateful to María Alpuente, Salvador Lucas, and the
anonymous referees for their helpful remarks. 
Abstract. This paper presents a complete (infinite) axiomatization for
an algebraic construction of graphs, in which a finite fragment denotes 
the class of graphs with bounded tree width. 



1 Introduction 

A graph is a flexible relational structure for describing problems. However, solv- 
ing graph problems can be difficult, partially because graphs lack an obvious 
recursive construction. 

The algebraic construction of graphs opens the possibility of applying the
following to graph algorithms:

— efficient programming methodologies, such as depth-first search,
divide-and-conquer, and dynamic programming, which would enable us to design
new graph algorithms, and

— program transformation techniques, which are well-developed in the
functional programming community [FS96, Erw97, SHTO00].

This is especially true for graphs with bounded tree width [RS86]. The class 
of graphs with bounded tree width is limited, but still contains interesting ap- 
plication areas; for instance, the control flow graphs of GOTO-free C programs 
have tree widths of at most 6 [Tho98], and those of practical Java programs 
mainly have at most 3 [GMT02]. 

A notable feature is that many graph problems that are NP-hard for general graphs
become solvable in linear time for graphs with bounded tree width [Cou90, BPT92].
This corresponds to the fact that algebraic constructions become finitely
generated for a class of graphs with bounded tree width [BC87, ACPS93, OHS03],
though they are infinitely generated for general graphs.

However, the algebraic structures referred to above are not initial, i.e., the same
graph could have several different expressions. Clarifying such equivalence could
lead to

— a debugging opportunity for programs, i.e., programs must have no conflicts
with axioms, and

— efficient algorithm design for graph properties, such as graph isomorphism.
Our ultimate aim is to give a complete (finite) axiomatization for graphs with
bounded tree width. This is half done; this paper presents the complete (infinite) 
axiomatization for an algebraic construction of general graphs, in which a finite 
fragment denotes graphs with bounded tree width. The idea of the proof for 
ground cases comes from [BC87]; our work further extends the completeness 
result to non-ground cases. 

This paper is organized as follows. Section 2 prepares basic notations. Sec- 
tion 3 presents an algebraic construction of graphs with infinite signatures, which 
is a variation of those in [ACPS93]. Section 4 gives the complete (infinite) axioms 
for ground terms, and Section 5 extends them to non-ground terms. Section 6 is 
a brief overview of related work, and Section 7 discusses future work. 



2 Preliminaries 

Let F be a set of function symbols and X a countably infinite set of variables.
Each function symbol f is supposed to have its arity ar(f). A function symbol
c such that ar(c) = 0 is called a constant symbol. The set of all terms, denoted
by T(F,X), built from F and X is defined as follows:

1. Constant symbols in F and variables in X are terms.

2. If $t_1, \ldots, t_n$ are terms, and f is a function symbol in F such that ar(f) = n,
then $f(t_1, \ldots, t_n)$ is a term.

V(t) denotes the set of variables occurring in a term t. A term without variables
is called a ground term, and a term in which each variable occurs at most
once is called a linear term. The set of ground terms is denoted by T(F) for the
set F of underlying function symbols.
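These notions translate directly into code. The sketch below is an illustration only; it assumes an encoding that is not part of the paper, in which a term is either a string (a variable) or a tuple whose first component is the function symbol, and the function names are hypothetical.

```python
from collections import Counter

def is_var(t):
    return isinstance(t, str)            # variables are plain strings

def var_occurrences(t):
    """Yield every variable occurrence of t, with repetitions."""
    if is_var(t):
        yield t
    else:
        for arg in t[1:]:                # t[0] is the function symbol
            yield from var_occurrences(arg)

def variables(t):
    """V(t): the set of variables occurring in t."""
    return set(var_occurrences(t))

def is_ground(t):
    return not variables(t)

def is_linear(t):
    """Each variable occurs at most once."""
    return all(n <= 1 for n in Counter(var_occurrences(t)).values())

c = ("c",)                               # a constant symbol
t1 = ("f", "x", ("g", "y"))              # f(x, g(y)): linear, not ground
t2 = ("f", "x", "x")                     # f(x, x): not linear
t3 = ("f", c, ("g", c))                  # f(c, g(c)): ground
```

A ground term is trivially linear, since it has no variable occurrences at all.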

Let □ be a fresh special constant symbol. A context C[ ] is a term built
from F ∪ {□} and X. When C[ ] is a context with n □'s and $t_1, \ldots, t_n$ are terms,
$C[t_1, \ldots, t_n]$ denotes the term obtained by replacing the i-th □ from the left in
C[ ] with $t_i$ for each i = 1, ..., n.

Definition 1. A term rewriting system (TRS) is a set R of rewrite rules. A
rewrite rule is a pair of terms denoted by $l \to r$ satisfying two conditions: (1) l
is not a variable and (2) $V(l) \supseteq V(r)$.

If $t = C[l\theta]$ and $s = C[r\theta]$ for $l \to r \in R$ and a substitution $\theta$, $t \to_R s$ is a
(one-step) reduction and $l\theta$ is called a redex.
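A one-step reduction can be sketched as follows, again as an illustration under assumptions rather than anything taken from the paper: terms are nested tuples with string variables, the rewrite strategy (try the root first, then the arguments from left to right) is an arbitrary choice, and the two addition rules are a hypothetical example TRS.

```python
def is_var(t):
    return isinstance(t, str)

def variables(t):
    if is_var(t):
        return {t}
    return set().union(*(variables(a) for a in t[1:]))

def is_rewrite_rule(l, r):
    """Conditions (1) and (2): l is not a variable and V(l) contains V(r)."""
    return not is_var(l) and variables(r) <= variables(l)

def match(pattern, term, subst=None):
    """Return a substitution theta with pattern*theta == term, or None."""
    subst = dict(subst or {})
    if is_var(pattern):
        if pattern in subst and subst[pattern] != term:
            return None
        subst[pattern] = term
        return subst
    if is_var(term) or pattern[0] != term[0] or len(pattern) != len(term):
        return None
    for p_arg, t_arg in zip(pattern[1:], term[1:]):
        subst = match(p_arg, t_arg, subst)
        if subst is None:
            return None
    return subst

def substitute(t, subst):
    if is_var(t):
        return subst.get(t, t)
    return (t[0],) + tuple(substitute(a, subst) for a in t[1:])

def rewrite_once(t, rules):
    """Contract one redex of t, or return None if t is a normal form."""
    for l, r in rules:
        theta = match(l, t)
        if theta is not None:            # t itself is the redex l*theta
            return substitute(r, theta)
    if not is_var(t):
        for i in range(1, len(t)):       # otherwise look inside a context
            s = rewrite_once(t[i], rules)
            if s is not None:
                return t[:i] + (s,) + t[i + 1:]
    return None

Z = ("z",)
RULES = [(("add", Z, "y"), "y"),
         (("add", ("s", "x"), "y"), ("s", ("add", "x", "y")))]
step1 = rewrite_once(("add", ("s", Z), Z), RULES)
step2 = rewrite_once(step1, RULES)
```

Here `step1` is s(add(z, z)) and `step2` is s(z), which is a normal form, so a further call returns None.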

A TRS R is terminating (or, strongly normalizing, SN for short) if there are
no infinite rewrite sequences $t_1 \to_R \cdots \to_R \cdots$.

Throughout the paper, we will use G, G' for (k-terminal) graphs, S for a set,
X for a set of variables, s, t for terms, h, i, j, k, l for indices, x, y for variables,
α, β for maps, θ for a substitution, and σ, τ for permutations. k
is also often used for the number of terminals. l (resp. r) is sometimes used for
the left-hand (resp. right-hand) side of a rewriting rule in a TRS.

3 Algebraic Construction of Graphs

In this paper we consider graphs with undirected edges, with at most one edge
between any two vertices, and with no edge between a vertex and itself.
(Extensions to multiple edges between vertices and to loops connecting a vertex to
itself are easy, and sketched in Remark 2 of Section 4.) A k-terminal graph G
is a graph with k distinguished vertices, called terminals, numbered 1 through
k. The set of vertices of G is denoted V(G), the set of edges of G is denoted by
E(G), and we write G[i] for the i-th terminal of G, where $1 \leq i \leq k$. Ordinary
graphs are obtained as 0-terminal graphs.

A k-terminal graph G is a pair of a graph and a tuple of its k distinct vertices,
called terminals. The i-th terminal in a k-terminal graph G with $1 \leq i \leq k$ is
denoted by G[i] (an array-like notation). Ordinary graphs are obtained as
0-terminal graphs after removal of terminals. For simplicity, we consider simple
graphs (i.e., undirected and without multiple edges) without loops; but, the
extensions to directed graphs, graphs with multiple edges, and/or graphs with
loops are straightforward. The set of vertices of G is denoted by V(G) and the
set of edges of G is denoted by E(G). The number of edges from a vertex v is
denoted by #e(v).

Definition 2. Let $\mathcal{B}_k$ be sorts for $k \geq 0$. Let $\oplus_k$, $l^i_k$, $r_k$, $\sigma^j_k$, $e^2$, $0$ be function
symbols with sorts below

$e^2 : \mathcal{B}_2$, $\quad l^i_k : \mathcal{B}_{k-1} \to \mathcal{B}_k$, $\quad \oplus_k : \mathcal{B}_k \times \mathcal{B}_k \to \mathcal{B}_k$,
$0 : \mathcal{B}_0$, $\quad r_k : \mathcal{B}_k \to \mathcal{B}_{k-1}$, $\quad \sigma^j_k : \mathcal{B}_k \to \mathcal{B}_k$,

where $i \leq k$, $j < k$, and $k > 0$ (for readability, $\oplus_k$ is an infix operation and the
rest are prefix). Let $B_n$ be the set of well-sorted ground terms in

$T(\{0, e^2, l^i_k, r_k, \oplus_k, \sigma^j_k \mid 1 \leq i \leq k \leq n,\ 1 \leq j < k\})$

and $B_\infty = \bigcup_n B_n$.

A term t of sort $\mathcal{B}_k$ is interpreted as a k-terminal graph (defined below) by
interpreting the function symbols $\oplus_k$, $l^i_k$, $r_k$, $\sigma^j_k$, $e^2$, $0$ as the following operations.
This interpretation is denoted by $\psi(t)$.

Definition 3. Let $\psi(e^2)$ be the edge with two terminals and $\psi(0)$ be the empty
graph. We define operations among k-terminal graphs as

— $\psi(l^i_k(t))$ is a lifting for $1 \leq i \leq k$, i.e., insert a new isolated terminal (as a
new vertex) to $\psi(t)$ at the i-th position in k − 1 terminals.

— $\psi(r_k(t))$ removes the last terminal from $\psi(t)$.

— $\psi(s \oplus_k t)$ is a parallel composition for $k \geq 0$, i.e., fuse each i-th terminal in
$\psi(s)$ and $\psi(t)$ for $1 \leq i \leq k$.

— $\psi(\sigma^i_k(t))$ is a permutation, i.e., permute the i-th terminal and the (i+1)-th
terminal in $\psi(t)$ for $1 \leq i < k$.
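The four operations can be tried out on a small concrete representation. The following Python sketch is an illustration under assumptions, not the construction of the paper: a k-terminal graph is stored as a vertex count, a set of undirected edges and a tuple of terminals, removing the last terminal is read as keeping the vertex but no longer marking it as a terminal, and the function names are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TGraph:
    n: int                 # vertices are 0 .. n-1
    edges: frozenset       # each edge is a frozenset {u, v}
    terminals: tuple       # terminals[i-1] is the i-th terminal

def empty():
    """Interpretation of the constant 0."""
    return TGraph(0, frozenset(), ())

def edge():
    """Interpretation of e^2: one edge between two terminals."""
    return TGraph(2, frozenset({frozenset({0, 1})}), (0, 1))

def lift(i, g):
    """Lifting: insert a fresh isolated vertex as the i-th terminal."""
    new = g.n
    ts = g.terminals[:i - 1] + (new,) + g.terminals[i - 1:]
    return TGraph(g.n + 1, g.edges, ts)

def remove_last(g):
    """The last terminal becomes an ordinary vertex."""
    return TGraph(g.n, g.edges, g.terminals[:-1])

def swap(i, g):
    """Permute the i-th and the (i+1)-th terminals."""
    ts = list(g.terminals)
    ts[i - 1], ts[i] = ts[i], ts[i - 1]
    return TGraph(g.n, g.edges, tuple(ts))

def par(s, t):
    """Parallel composition: disjoint union, then fuse the i-th terminals."""
    assert len(s.terminals) == len(t.terminals)
    rename = dict(zip(t.terminals, s.terminals))
    next_id = s.n
    for v in range(t.n):
        if v not in rename:              # non-terminals of t get fresh names
            rename[v] = next_id
            next_id += 1
    edges = set(s.edges)
    for e in t.edges:
        u, v = tuple(e)
        edges.add(frozenset({rename[u], rename[v]}))
    return TGraph(next_id, frozenset(edges), s.terminals)

# A triangle built from three lifted edges on three terminals.
three = par(par(lift(3, edge()), lift(1, edge())), lift(2, edge()))
triangle = remove_last(remove_last(remove_last(three)))
```

The triangle uses operations of index at most 3 and has tree width 2. Since edges are kept in a set, composing an edge with itself in parallel gives the same edge back, which is the simple-graph reading used in this section.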
Fig. 1. An example of the algebraic construction



Example 1. Fig. 1 shows the algebraic construction of a (0-terminal) graph.
Each operation, underlined in ri(r 2 (e^ 02 02 ^K^2(e^))) 03 ^|(e^)))), is 

figured in lower columns. 



Remark 1. Each permutation $\sigma$ on $\{1, \ldots, k\}$ is generated from $\sigma^i_k$'s. For
instance, a circular permutation is generated as






• • • CTfc = 



fii+l--- j \ 



for 1 < i < j < k. 



Although we do not show the definition of graphs with bounded tree width,
the characterization of graphs with tree width at most k is given by the following
theorem. This theorem is obtained similarly to that in [ACPS93].

Theorem 1. For $k \geq 0$, $\psi(B_{k+1})$ is the set of graphs with tree width at most k
(by neglecting terminals).



4 Complete Axiomatization of Graphs: Ground Cases 

A k-terminal graph could be denoted by different algebraic expressions; for
instance, see Example 2.

Example 2. Two terms below are equivalent and both denote the (0-terminal) 
graph in Fig. 1. 

ri(r2(e2 02 r^ilUe^ 02 ^^^ 2 ( 6 ^))) 03 ^Ke^))))) 
ri(r2((e2 02 Ur^ie^))) 02 r^ilUe^) 0g Z^e^)))) 
In this section, we show that the (infinite) set of axioms $\mathcal{E}_\infty$ (in Fig. 3) is
sound and complete for ground terms (Theorems 2 and 3). The key of the proof
is the existence of a canonical form that denotes a graph in which all vertices are
terminals (see Example 3). Then, canonical forms denoting an isomorphic graph
are converted to each other by the associativity and commutativity rules of the
parallel composition $\oplus_k$'s (AC1 and AC2 in Fig. 3) and suitable permutations
$\sigma^i_k$'s among terminals.

Example 3. Fig. 2 shows a transformation to obtain a canonical form of the
expression in Example 1, where $R_1$ will be defined in Definition 6. The underlined
parts correspond to the rewrite steps. (The infix operation $\oplus_4$ has the commutative
and associative axioms, and we omit parentheses in the last line for readability.)



ri(r2(e^ ©2 ©2 lh{r2 je'^))) ©3 ^i(e^)))) 

-Hi ri(r2(e2 ©2 ©2 r 3 (?Ke^)) ), 03 ^I(e^)))) 

— Hi ri(r2(e^ ©2 ra( Z| (r3 (Z|(e^) ©3 lUe'^))) ©2 ^l(e^)))) 

—Hi ri(r2(e^ ©2 r 3 (r 4 ( /j(Zg(e^) ©3 llje^)) ) ©3 ^l(e^)))) 

—Hi ri(r2(e^ ©2 raj r^llilUe^)) ©4 l\{ll{e^))) ©3 ^j(e") ))) 
—Hi ri(r2(e^ ©2 rs{r 4 ,[l\{ll{e^)) ©4 l\{l\{e^)) ©4 d(^i(e^)))))) 




( 2 )^ \( 4 ) 



Increase 
^ the number 
of terminals 



,+ 

Hi 



ri(r2(r3(r4(?l(Z| (e^) )©4ii!l(/l' (e=) ) ©4ii!l(;I' (e^) ) ©4i/4(?3 (e=) ))))) 




Fig. 2 . Example of transformation to a canonical form (ground case) 



Definition 4. k-terminal graphs $G_1$, $G_2$ are isomorphic if there exists a
one-to-one onto map $\alpha : V(G_1) \to V(G_2)$ such that

— For $v \in V(G_1)$, if v is the i-th terminal of $G_1$ with $1 \leq i \leq k$, then $\alpha(v)$ is
the i-th terminal of $G_2$, and vice versa.

— For $v, v' \in V(G_1)$, if (v, v') is an edge of $G_1$, then $(\alpha(v), \alpha(v'))$ is an edge
of $G_2$, and vice versa.

Definition 5. Two terms s, t of sort $\mathcal{B}_k$ are equivalent if the k-terminal graphs
$\psi(s)$, $\psi(t)$ are isomorphic.
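For small graphs, the isomorphism condition can be checked by brute force. The sketch below is only an illustration: the triple encoding (vertex count, set of edges, tuple of terminals) and the function name are assumptions, and the search over all bijections of the non-terminal vertices is exponential, so it is not meant as an algorithm for graph isomorphism.

```python
from itertools import permutations

def isomorphic(g1, g2):
    """Search for a terminal-preserving bijection that maps edges onto edges.

    A graph is a triple (n, edges, terminals): vertices are 0 .. n-1,
    edges is a set of frozensets {u, v}, terminals is a tuple of vertices.
    """
    n1, e1, t1 = g1
    n2, e2, t2 = g2
    if n1 != n2 or len(e1) != len(e2) or len(t1) != len(t2):
        return False
    fixed = dict(zip(t1, t2))            # the i-th terminal goes to the i-th terminal
    free1 = [v for v in range(n1) if v not in fixed]
    free2 = [v for v in range(n2) if v not in set(t2)]
    for perm in permutations(free2):
        alpha = dict(fixed)
        alpha.update(zip(free1, perm))
        image = {frozenset(alpha[v] for v in e) for e in e1}
        if image == set(e2):             # same edge count, so this covers both directions
            return True
    return False

one_edge = {frozenset({0, 1})}
path = {frozenset({0, 1}), frozenset({1, 2})}

swapped = isomorphic((2, one_edge, (0, 1)), (2, one_edge, (1, 0)))
end_vs_middle = isomorphic((3, path, (0,)), (3, path, (1,)))
```

Swapping the two terminals of a single edge gives an isomorphic 2-terminal graph, whereas a path with its terminal at an end is not isomorphic to the same path with its terminal in the middle.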

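$\mathcal{E}_k$ in Fig. 3 is the set of axioms indexed by k. Let $\mathcal{E}_\infty = \bigcup_k \mathcal{E}_k$ and
$\mathcal{E}_{\leq n} = \bigcup_{k \leq n} \mathcal{E}_k$. By regarding each equation (axiom) as a left-to-right rewrite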
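(AC1)     $t_1 \oplus_k t_2 = t_2 \oplus_k t_1$   (Commut.)
(AC2)     $(t_1 \oplus_k t_2) \oplus_k t_3 = t_1 \oplus_k (t_2 \oplus_k t_3)$   (Assoc.)
(l-Com)   [equation illegible in source; side condition as extracted: 1 < i < j < k]
(l-Dist)  $l^i_k(t_1 \oplus_{k-1} t_2) = l^i_k(t_1) \oplus_k l^i_k(t_2)$   $1 \leq i \leq k$
(E1)      $l^i_{k-1}(r_{k-1}(t)) = r_k(l^i_k(t))$   $1 \leq i < k$
(E2)      $t_1 \oplus_{k-1} r_k(t_2) = r_k(l^k_k(t_1) \oplus_k t_2)$
(E3)      $t \oplus_k l^k_k(\cdots l^1_1(0)) = t$
(E4)      $e^2 \oplus_2 e^2 = e^2$
(σ1-a)    [equation illegible in source; side condition as extracted: 1 < i < j < k]
(σ1-b)    [equation illegible in source; side condition as extracted: 1 < i < k]
(σ1-c)    [equation illegible in source; side condition as extracted: 1 < i < k]
(σ1-d)    [equation illegible in source; side condition as extracted: 1 < j+1 < i < k]
(σ2)      $\sigma^1_2(e^2) = e^2$
(σ3)      $\sigma^i_k(t_1 \oplus_k t_2) = \sigma^i_k(t_1) \oplus_k \sigma^i_k(t_2)$   $1 \leq i < k$
(σ4)      $\sigma^i_{k-1}(r_k(t)) = r_k(\sigma^i_k(t))$   $1 \leq i < k-1$
(σ5)      [left-hand side illegible in source] $= r_{k-1}(r_k(t))$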

Fig. 3. Axioms £k of the algebraic construction of graphs 



rule, its reflexive symmetric transitive closure (i.e., the finite application of
axioms in $\mathcal{E}_\infty$) is denoted by $=_{\mathcal{E}_\infty}$.

It is easy to see that each axiom in $\mathcal{E}_\infty$ is sound.

Theorem 2. (Soundness for ground terms) Let s, t be ground terms in $B_\infty$.
Then, s and t are equivalent if $s =_{\mathcal{E}_\infty} t$.

Theorem 3. (Completeness for ground terms) Let s, t be ground terms in
$B_\infty$. Then, $s =_{\mathcal{E}_\infty} t$ if s and t are equivalent.

Definition 6. For axioms in $\mathcal{E}_\infty$, let TRSs $R_1$ and $R_2$ be defined as

$R_1 = \{(E1), (E2), (E2)', (l\text{-}Dist), (\sigma 3), (\sigma 4)\}$,
$R_2 = \{(\sigma 1), (\sigma 2)\}$,

where $(E2)'$ is $r_k(t_1) \oplus_{k-1} t_2 \to r_k(t_1 \oplus_k l^k_k(t_2))$ for each k.

Lemma 1. $R_1$ and $R_2$ are terminating.
Proof. Let $\delta(t, f)$ be the number of occurrences of a function symbol f in a term
t, and let $\Delta(t, g, f)$ be the sum of all $\delta(s, f)$ where s is a subterm of t such that
root(s) = g. We define the weight $\omega(t)$ of a term t by

uj(t) = + U>a-,r{t) + Wo',©(t)) 



where 

= ^i,j,kA{t,lp(Bk), 

^cr,0(t) — ^i.j.kA(^tjO'j.j(Bk\ 

and define the lexicographic order on the weight. Then, for each reduction of 
Ri the weight uj{t) decreases, and Ri is SN. Similarly, each reduction of i?2 
decreases the weight uJa,i{t) = Sij^i^jiA{t,aj,lf), and R2 is SN. ■ 

Definition 7. Let $t \in B_\infty$ be a ground term of sort $\mathcal{B}_k$, $n = |V(\psi(t))|$, and
$m = |E(\psi(t))|$. t is a canonical form if either

$t = r_{k+1}(\cdots r_n(l^n_n(\cdots l^1_1(0))))$,

or there exist

— $R_{n,k}[\ ] = r_{k+1}(\cdots r_n[\ ])$ with $0 \leq k \leq n$,

— $P_n[\ , \cdots, \ ]$ consists of $\oplus_n$'s,

— $L_i[\ ]$ has the form $l^{u_{i,n-2}}_n(\cdots l^{u_{i,1}}_3[\ ])$ with $u_{i,n-2} > \cdots > u_{i,1}$ for $1 \leq i \leq m$,

such that $t = R_{n,k}[P_n[L_1[e^2], \cdots, L_m[e^2]]]$.

Lemma 2. For any term s, there exists a canonical form $t \in B_n$ such that
$s =_{\mathcal{E}_{\leq n}} t$ where $n = |V(\psi(t))|$.

Proof. We first show that there exists t' in the form $t' = R_{n,k}[P'[L'_1[c_1], \cdots, L'_l[c_l]]]$
with $s =_{\mathcal{E}_{\leq n}} t'$ where

— $R_{n,k}[\ ] = r_{k+1}(\cdots r_n[\ ])$;

— $P'[\ ]$ consists of $\oplus_j$'s, and

— $L'_1[\ ], \cdots, L'_l[\ ]$ consist of $l^i_j$'s and $\sigma^i_j$'s.

— $c_i$ is either $e^2$ or 0.

From Lemma 1, s has an $R_1$-normal form t' of the form $R_{n,k}[P'[L'_1[c_1], \cdots, L'_l[c_l]]]$.
Since all vertices in $\psi(e^2)$ are terminals and $l^i_j$, $\sigma^i_j$ preserve the set of terminals, all
vertices of each $L'_i[e^2]$ are terminals, and $\oplus_j$ does not change the number of
vertices; thus each $\oplus_j$ in $P'[\ ]$ satisfies $j = n = |V(\psi(t))|$. Further, from Lemma 1
each $L'_i[c_i]$ has an $R_2$-normal form, i.e., a $\sigma^i_j$-free term.

If $|E(\psi(s))| = 0$, this means $\psi(s)$ consists of isolated vertices and all $c_i$'s are
0. Thus, $L'_i[\ ] = l^n_n(\cdots(l^1_1[\ ]))$ by (l-Com) and s is reduced to a canonical form
$R_{n,k}[L_1[0]]$ by (AC1), (AC2), and (E3).

If $|E(\psi(s))| > 0$, we can sort each $L'_i[\ ]$ by (l-Com). Since there exists $c_i = e^2$,
we can erase 0's by (AC1), (AC2), and (E3). Thus we assume $c_i = e^2$ for each
i. If $L'_i[c_i]$ and $L'_j[c_j]$ are equal, we can eliminate redundant $L'_i[c_i]$'s by (AC1),
(AC2), and (E4). Since each $L'_i[c_i]$ corresponds to an edge in $\psi(s)$ (i.e., the
number of $L'_i[c_i]$'s is the number of edges in $\psi(s)$), we obtain a canonical form
$t = R_{n,k}[P_n[L_1[e^2], \cdots, L_m[e^2]]]$ by (l-Com) (from-right-to-left direction). ■

Definition 8. Let e(n, i,j) = l'!^--- • (;j“^ • • • ' ^l+i ’ ’ ' ^3(6^) for 1 < i < 

j < n (here we omit apparent parenthesis for readability) . 



Lemma 3. Let $s \in B_\infty$. $\psi(s)$ contains an edge between the i-th and the j-th
vertices if, and only if, a canonical form of s contains e(n,i,j).

Sketch of proof of Theorem 3. Let $s, t \in B_\infty$ such that $\psi(s)$ and $\psi(t)$ are
isomorphic. Assume that an isomorphism $\alpha : V(\psi(s)) \to V(\psi(t))$ satisfies the
conditions in Definition 4. If $|E(\psi(s))| = |E(\psi(t))| = 0$, they have the unique
canonical form from Lemma 2 and obviously the theorem holds. We assume
$|E(\psi(s))| = |E(\psi(t))| > 0$.

From Lemma 2, we can assume that both s and t are canonical. Let s = 
Rn.k[Pn[Li[e'^],- ■ ■ ,L^[e'^]]] and t = • • • , L(„[e^]]] where n = 

|y('0(s))| = \V{'ip(t))\ and m = |A(0(s))| = \E{^f{t))\. Thus, a can be regarded 
as the permutation a on {k + 1, • • • , n}. 

Non-trivial permutation needs at least two elements, so we can assume k < 
n-2. Then from (cr4) and (ct 5), r^i(- • • ?'”«(i)) = r’flU- ■ ■ r()(t)) for /c+1 < 
i < n — 1. Since a permutation over {k + 1, ■■■ ,n} is generated by a(,’s for 
k+1 <i< n— 1, • •r”(cr(t)) = • ■r”(t)). Thus, it is enough to show 

a{P4Li[e% • • • , L™[e^]]) P(,[L[[e%- ■ ■ , L'Je% 

Since $\psi(s)$ and $\psi(t)$ are isomorphic, if there is an edge between the i-th and
j-th vertices of $\psi(s)$, there is an edge between the $\alpha(i)$-th and $\alpha(j)$-th vertices of
$\psi(t)$, and vice versa. Thus, if there is an edge between the i-th and j-th vertices
in $\psi(s)$, then, from Lemma 3, there uniquely exist $L_h[e^2]$ and $L'_{h'}[e^2]$ such that
$L_h[e^2] =_{\mathcal{E}_{\leq n}} e(n,i,j)$ and $L'_{h'}[e^2] =_{\mathcal{E}_{\leq n}} e(n,\alpha(i),\alpha(j))$.

Since $\alpha(e(n,i,j)) = e(n,\alpha(i),\alpha(j))$,

$\alpha(P_n[L_1[e^2], \cdots, L_m[e^2]]) =_{\mathcal{E}_{\leq n}} P'_n[L'_1[e^2], \cdots, L'_m[e^2]]$

holds from (AC1), (AC2), (σ2), and (σ3). ■

Remark 2. The extensions to directed graphs, graphs with multiple edges, and/or
graphs with loops are as follows:

— The removal of (E4) in Fig. 3 gives the sound and complete axioms for
graphs with multiple edges.

— By adding a constant as a 1-terminal graph that consists of the unique
terminal and the unique edge from the terminal to the terminal itself, we obtain
the algebraic construction of graphs with loops. The axioms are preserved
for this extension.

~ For digraphs, instead of an edge e^, we use and et, where is the 
directed edge from the first terminal to the second, and is opposite. Then, 
the replacement of cr 2 (e^) = (cr2) with tT 2 (e^) = e?_ and lead 

the sound and complete axioms for directed graphs. 

5 Complete Axiomatization of Graphs: Non-ground Cases

In this section, we extend the results of soundness (Theorem 2) and completeness
(Theorem 3) for ground terms to general terms. In this extension, we need
additional axioms (S1) and (S2) in Fig. 4, which present the defining relations
of the permutation group [Wey39].

Lemma 4. [Wey39] For any permutations $\sigma$ and $\sigma'$ that are expressed as
products of $\sigma^i_k$'s with $1 \leq i < k$, they are equivalent as a map if and only if
$\sigma =_{\mathcal{S}_k} \sigma'$, where $\mathcal{S}_k$ consists of the (S1) and (S2) axioms in Fig. 4.

(S1)   $\sigma^i_k \cdot \sigma^i_k(t) = t$   $1 \leq i < k$
(S2)   [equation illegible in source; side condition as extracted: 1 < i < k]

Fig. 4. Additional axioms $\mathcal{S}_k$ of the algebraic construction of graphs

Example 4. Consider the permutation of 1 and 3 among {1,2,3}, which is
represented as $\sigma^1_3 \cdot \sigma^2_3 \cdot \sigma^1_3$ or $\sigma^2_3 \cdot \sigma^1_3 \cdot \sigma^2_3$. This equivalence is obtained
by $=_{\mathcal{S}_3}$ as




• ^3 • 0-3 =i:2 cr| • cr^ • (al ■ cr |)3 • cr| 



= cr| • (cr| • cr|) ■ cr| • cr} • cr| • cr| • (cr| • cr|) 

= i;i (c^i ■ cr|) ■ cr| ■ • cr{ 

=i:i ct{ 



Remark 3. For ground terms, (S1) and (S2) in Fig. 4 are not required, because
the same can be performed by (σ1-d) and (σ2) in Fig. 3.

Let $X_k$ be a set of variables with sort $\mathcal{B}_k$. The i-th terminal of x is denoted
by x[i]. Let $X = \bigcup_k X_k$. The set of well-sorted terms in

$T(\{0, e^2, l^i_k, r_k, \oplus_k, \sigma^j_k \mid 1 \leq i \leq k \leq n,\ 1 \leq j < k\}, X)$

is denoted by $B_\infty(X)$. Define a substitution $\theta_0$ by $x\theta_0 = $ [expression illegible in source] for each
variable $x \in X_k$.

Definition 9. For $s, t \in B_\infty(X)$, $s$ and $t$ are equivalent if, for each ground
substitution $\theta$, $\psi(s\theta)$ and $\psi(t\theta)$ are isomorphic.

The next theorem is immediate.

Theorem 4. (Soundness) Let $s, t$ be terms in $B_\infty(X)$. Then $s$ and $t$ are equivalent if $s =_{\mathcal{E}_\infty \cup \mathcal{S}_\infty} t$.

The difficult part is completeness.

Theorem 5. (Completeness) Let $s, t$ be terms in $B_\infty(X)$. Then $s =_{\mathcal{E}_\infty \cup \mathcal{S}_\infty} t$
if $s$ and $t$ are equivalent.

Similar to the ground case, we first consider a canonical form of a term $t$.
The set of variables that appear in a term $t$ in $B_\infty(X)$ is denoted by $V(t)$.

Definition 10. Let $t$ ($\in B_\infty(X)$) be a term of sort $B_k$, $n = |V(\psi(t\theta_0))|$, $m = |E(\psi(t\theta_0))|$, and $V(t) = \{x_1, \cdots, x_{m'}\}$. $t$ is a canonical form if either

t = rk+i{---rr,{lZ{---ll{0)))), 

or there exist 

Lln.k [ ] — ^fc-t-l(' * *'Cn[ ]); 

~ Pn[ ,■ ■ ■ , ] consists of ©„ ’s, 

— Li[ ] has the form ln'"'~^{- ■ '^ 3 ’’^ D Ui,n -2 > ■ ■ ■ > Ui^i for 1 < i < m, 

— Lm+i[ ] has the form ln’"~'^'{- ■ ■ Q:h[ ]) with u' > • • • > u' ^ for Xi G 
Xd- and 1 < i < m' , 

— Gi is CTi(xi) for some combination <Ji of ’s for 1 < i < m' , 
such that 



$t = R_{n,k}[P_n[L_1[e^2], \cdots, L_m[e^2], L_{m+1}[G_1], \cdots, L_{m+m'}[G_{m'}]]]$.

Define $Center(t) = \psi(R_{n,k}[P_n[L_1[e^2], \cdots, L_m[e^2]]])$. For a ground substitution $\theta$, let $Inner(t, \theta) = V(Center(t))$ and $Outer(t, \theta) = V(\psi(t\theta)) \setminus Inner(t, \theta)$.
We say a vertex is inner if it is in $Inner(t, \theta)$, and outer otherwise.

Lemma 5. $Center(t)$ is isomorphic to $\psi(t\theta_0)$.
[Figure: step-by-step transformation of the term $t$ of Example 5 into a canonical form, annotated with the contexts $R[\ ]$, $L_1[\ ]$, $L_3[\ ]$ and with $Center(t)$.]

Fig. 5. Example of transformation to a canonical form (non-ground case)



Example 5. Fig. 5 shows the conversion of 

t = T2 •P2(e^r3 -P3(;g •p2(e^^2 ' »^2(e^)),crg • ag • ag • Zg(x))) 

to a canonical form. The circle expresses a substitution to a variable a;, and the 
parenthesis for and the commutative associative operator ©4 are omitted. 

The next lemma is proved similarly to Lemma 2.

Lemma 6. For any term $s \in B_\infty(X)$, there exists a canonical form $t \in B_n$ such
that $s =_{\mathcal{E}_\infty} t$ where $n = |V(Center(t))|$.

When terms $s$ and $t$ are equivalent, without loss of generality, we can assume
that $s$ and $t$ are canonical forms. Let us fix canonical forms $s$ and $t$.

Lemma 7. If $s$ and $t$ are equivalent, $V(s) = V(t)$.

Proof. Assume $V(s) \ne V(t)$. Without loss of generality, we can assume that
$x \in V(s)$ and $x \notin V(t)$. From Lemma 5, $Center(s)$ and $Center(t)$ are isomorphic.
Let $n = |V(Center(s))| = |V(Center(t))|$.

Consider a ground substitution $\theta$ that substitutes a term denoting $K_{n+1}$
(the complete graph with $n + 1$ vertices) for $x$, and $l_k^k(\cdots l_1^1(0))$ for every other variable (for those
in $X_k$). Then $|V(\psi(s\theta))| > |V(\psi(t\theta))| = n$, which is a contradiction. ■

Lemma 8. If $s$ and $t$ are equivalent, each variable $x$ occurs the same number of times
in $s$ and in $t$.

Proof. Assume that $x$ occurs in $s$ more often than in $t$. Similar to Lemma 7, consider
a ground substitution $\theta$ that substitutes a term denoting $K_{n+1}$ (the complete graph
with $n + 1$ vertices) for $x$, and $l_k^k(\cdots l_1^1(0))$ for every other variable (for those in $X_k$). Then
$|V(\psi(s\theta))| > |V(\psi(t\theta))|$, which is a contradiction. ■
For notational clarity, we consider conditional linearization of a term.

Definition 11. Conditional linearization of a term $t$ is obtained by renaming
different occurrences of the same variable $x$ to distinct variables $x', x'', \cdots$, associated
with the side condition $C = \{x' = x'' = \cdots\}$.
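Definition 11 can be read operationally. The following is a minimal sketch, assuming that terms are encoded as nested tuples (symbol, arguments...) and variables as strings; the function name `linearize` and this encoding are illustrative choices, not part of the paper, and the indices of the function symbols are dropped for brevity.

```python
from collections import Counter

def linearize(term, variables):
    """Return (linear_term, side_condition) for a term given as nested tuples.

    A term is either a string (a variable if it is in `variables`) or a
    tuple (symbol, arg1, ..., argn).  Variables that occur only once are
    left untouched; repeated ones are renamed x', x'', ... left to right.
    """
    def occurrences(t):
        if isinstance(t, str):
            return Counter([t]) if t in variables else Counter()
        total = Counter()
        for arg in t[1:]:
            total += occurrences(arg)
        return total

    counts = occurrences(term)
    seen = Counter()
    classes = {}          # original variable -> list of its fresh copies

    def rename(t):
        if isinstance(t, str):
            if t in variables and counts[t] > 1:
                seen[t] += 1
                fresh = t + "'" * seen[t]
                classes.setdefault(t, []).append(fresh)
                return fresh
            return t
        return (t[0],) + tuple(rename(arg) for arg in t[1:])

    return rename(term), classes

# A term with x occurring twice and y once (hypothetical symbol names).
term = ("p3", ("l", ("p2", "x", "y")), ("l", "x"))
linear, condition = linearize(term, {"x", "y"})
```

Here `condition` records, for each repeated variable, the list of fresh names that the side condition $C$ equates.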



From now on, we consider conditional linearization of canonical forms $s$ and
$t$. Let us fix $V(s)\,(= V(t))$ as $\{x_1, \cdots, x_m\}$ with the side condition $C : \{x_i = x_j\}$.
Note that from Lemmas 7 and 8, such $C$ is well-defined.

Next we define $x_i[t, j]$, which is the vertex in $Center(t)$ that corresponds to
the $j$-th terminal in $\psi(x_i\theta)$ for each ground substitution $\theta$.

Definition 12. We borrow the notation from Definition 10. Let $t$ be a canonical
form $t = R_{n,k}[P_n[L_1[e^2], \cdots, L_m[e^2], L_{m+1}[G_1], \cdots, L_{m+m'}[G_{m'}]]]$, and let

$(v_1, v_2, \cdots, v_n)$ be the tuple of terminals of



with wi < ■ ■ ■ < Wdi ■ 

Example 7. In Example 5, a;[t, 1] = v^ and x[t,2] = v\. 

Below, we define a marker substitution $\theta_M$, which distinguishes each terminal
$x_i[t, j]$ by the pair of its outer neighborhoods; these neighborhoods are
distinguished from each other by the number of edges in $\psi(t\theta_M)$.

Since the number of edges and the neighborhood relation are preserved by
an isomorphism, an isomorphism between $\psi(s\theta_M)$ and $\psi(t\theta_M)$ induces the
isomorphism between $Center(s)$ and $Center(t)$ that maps $x_i[s, j]$ to $x_{i'}[t, j]$ with
$x_i = x_{i'} \in C$.

Definition 13. Let $term_1, \cdots, term_d$ be vertices, and let $ch_0, \cdots, ch_d$ be their
children. A rooted tree with the root vertex $v$ and its $m$ children is denoted by
$br(v, m)$. For $d \le h$, a marker forest $MF(h, d)$ is a $d$-terminal graph such that



Example 6. Conditional linearization of a term p^{l\fp 2 {x, y)), l^ix)) is 
P3{ll{P2{xfy)),li{x”)) with {x' = x”}. 



$\psi(P_n[L_1[e^2], \cdots, L_m[e^2], L_{m+1}[G_1], \cdots, L_{m+m'}[G_{m'}]]\,\theta_0)$.



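Assume that a variable $x_i$ in $t$ is of the sort $B_{d_i}$, and let $L_{m+i}[\ ]$ be as in Definition 10,

with exponents $u_{n-d_i} > \cdots > u_1$. Define $x_i[t, j] = v_{\sigma_i^{-1}(w_j)}$ where

$\{w_1, \cdots, w_{d_i}\} = \{1, \cdots, n\} \setminus \{u_1, \cdots, u_{n-d_i}\}$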



= 



V{MF{h,d))) 





U {termi}) otherwise 
and

E{MF{h,d)) 

u 

= <1 E{br{cho,h - d)) U (U^< /i + 2^ - 2))) 

[ U {(fermi, (term*, c/ij), (c/io,c/ii)} 

A marker term $Mt(h, d)$ is a term that denotes $MF(h, d)$.



if d = 0 
otherwise 




Fig. 6. $d$-terminal graph $MF(h, d)$



Lemma 9. In $MF(h, d)$, $h + 1 \le \#e(ch_i) \le h + 2d + 1$ for each $0 \le i \le d$ and
$\#e(ch_i) < \#e(ch_j)$ if $i < j$. More precisely, $\#e(ch_i) = h + 2i + 1$ for $0 \le i < d$
and $\#e(ch_d) = h + 2d$.

Definition 14. Without loss of generality, we can assume that $x_1, \cdots, x_l$ are
the representatives under the side condition $C$ of $t$ (i.e., $x_1, \cdots, x_l$ are mutually
distinct and for each $x \in V(t)$ there exists some $x_i$ such that $C$ contains $x = x_i$
with $1 \le i \le l$). Let $x_i \in X_{d_i}$.

Let $n = |V(\psi(t\theta_0))|$. The marker substitution $\theta_M$ (see Fig. 6) is a ground
substitution such that

f xi 9 m = Mt{n + 2, d\) 

\ Xi+i 9 m = Mt\n + 2 + dj, di+i) for 1 <i < 1. 

Example 8. In Example 5, $x\theta_M = Mt(6, 2)$ (see Fig. 7).



Fig. 7. Substitution of $MF(6, 2)$ for $x$ in Example 5
Lemma 10. Let $v \in \psi(t\theta_M)$ and $n = |V(Center(t))|$. If $v$ is inner, $2 \le \#e(v) \le n + 2$. If $v$ is outer, either $\#e(v) = 1$ or $\#e(v) > n + 2$.

Lemma 11. If $s$ and $t$ are equivalent, an isomorphism $\alpha$ between $\psi(s\theta_M)$ and
$\psi(t\theta_M)$ satisfies:

- $\alpha$ is an isomorphism between $Center(s)$ and $Center(t)$.
- For each $x_i$, there exists $x_{i'}$ with $x_i = x_{i'} \in C$, $\alpha(\psi(x_i\theta_M)) = \psi(x_{i'}\theta_M)$, and $\alpha(x_i[s, j]) = x_{i'}[t, j]$.

Proof. From Lemma 10, $\alpha(V(Center(s))) = V(Center(t))$.

Let $n = |V(Center(s))|$. For $ch_0$ in $\psi(x_i\theta_M)$, there exist $x_{i'}$ with
$x_i = x_{i'} \in C$ and $ch_0'$ in $\psi(x_{i'}\theta_M)$ such that $\alpha(ch_0) = ch_0'$, by construction.
Since the unique neighbor $v$ of $ch_0$ satisfying $2 \le \#e(v) \le n + 2$
is $term_1$, $\alpha(term_1) = term_1'$ with $term_1'$ in $\psi(x_{i'}\theta_M)$. Since $ch_1$ is the unique
neighbor of $ch_0$ that has more than $n + 2$ edges, $\alpha(ch_1)$ must be $ch_1'$. Repeating
a similar construction, the lemma is proved. ■

Sketch of proof of Theorem 5. By using the isomorphism $\alpha$ in Lemma 11,
similarly to the proof of Theorem 3, we obtain the proof of Theorem 5. ■

6 Related Work 

There are many works on algebraic constructions of graphs, including 

- [FS96, Erw97] for functional programming,
- [CS92, Has97] from the categorical viewpoint,
- [MSvE94, AA95] for term graphs,
- [Gib95] for directed acyclic graphs, and
- [BC87, ACPS93, OHS03] for graphs with bounded tree width.

Among them, only [BC87, ACPS93, OHS03] characterize the class of graphs 
with bounded tree width. Bauderon and Courcelle presented the complete axiom- 
atization for ground terms [BC87, Cou90] in their formalization. Their algebraic 
construction consists of the function symbols 



and their complete axiomatization is shown in Fig. 8. 

This paper gives the complete axiomatization for the variation of the alge- 
braic construction given in [ACPS93] . Our choice of formalization comes from its 
compatibility with SP Term, since SP Term seems the most suitable data struc- 
ture for programming on graphs with bounded tree width [OHS03] . The idea for 
the proof of the completeness for ground cases (Section 4) comes from [BC87]; 
this paper further extends the result to non-ground cases (Section 5). 




where their interpretation is

- $\psi(\oplus_{m,n}(t_1, t_2))$ is a disjoint union of $\psi(t_1)$ and $\psi(t_2)$,
- $\psi(\theta_{i,j,n}(t))$ fuses the $i$-th and $j$-th terminals for $1 \le i < j \le n$, and
- $\psi(\sigma_\alpha(t))$ renumbers the $\alpha(i)$-th terminal as the $i$-th terminal for $\alpha : [1..m] \to [1..n]$.
[Figure: the axioms (R1) to (R11) of [BC87, Cou90], among them the associativity law $(s \oplus t) \oplus u = s \oplus (t \oplus u)$ and the unit law $t \oplus 0 = t$.]

Fig. 8. Axioms of algebraic construction of graphs in [BC87, Cou90]

7 Conclusion and Future Work 

This paper presents the complete axiomatization for the variation of the algebraic 
construction given in [ACPS93]. Compared to the original algebraic construction 
in [ACPS93], we add (which is needed for completeness; the parallel compo- 
sition pk has the different infix notation ©^ for readability), and omit Sk, which 
is defined as 



Our final goal is to give the complete (finite) axiomatization of SP Term 
SPk [OHS03], which precisely denotes graphs with tree width at most k. SP 
Term would be the most desirable algebraic construction for writing a functional 
program on graphs with bounded tree width, because it has only 2 functional 
constructors: the series composition Sk and the parallel composition ©^ (though 
it has relatively many constants ek{i,j) and k, which can be treated in a homo- 
geneous way). We will use two approaches, one from rewriting and another from 
graph theory. 

— We already know the complete axioms on Boo, which consist of terms con- 
structed from ©fc, rfc, (T^, e^, 0. We can define Sfe,efc(f,j),k like “macros”. 
Can we deduce equations on “macros” from equations on terms constructed 
from original function symbols ? 







r2(e^ ©2 ^ 2 (^ 1 )) 

i^k+i (^1) ®fc+i ■ ■ ■ ®fe+i 



if fc = 1, 
iffc>2. 
— Minimal separator of a graph is essential for graphs with bounded tree width.
We hope that the Menger-like property [Tho90] would help.
Abstract. For equational specifications validity coincides with derivability
in equational logic, which in turn coincides with convertibility
generated by the rewrite relation. It is shown that this correspondence,
essentially due to Birkhoff, can be generalised in a uniform way to
sub-equational logics such as Meseguer’s rewriting logic.



1 Introduction 

In order to motivate and state our generalisation, we illustrate the essential
ingredients of the usual correspondence (see, e.g. Chapter 7 of [1] or Chapter 3
of [2]) between validity, derivability and convertibility by means of the following
equational specification $\mathcal{E}_{Mul}$ of addition and multiplication:

$$
\begin{aligned}
A(x, 0) &\approx x & (1)\\
A(x, S(y)) &\approx S(A(x, y)) & (2)\\
M(x, 0) &\approx 0 & (3)\\
M(x, S(y)) &\approx A(x, M(x, y)) & (4)
\end{aligned}
$$

and the equation:

$$
M(S(x), S(0)) \approx S(x) \qquad (5)
$$



On the one hand, (5) is valid for the specification $\mathcal{E}_{Mul}$ in the sense that it
holds in any model. In algebraic semantics, terms are given meaning by means
of an algebra. The algebra is then called a model of the specification if each
equation in the latter holds in the former. That is, the meanings of the left- and
right-hand side of the equation are identical, for any assignment to the variables.
For instance, the algebra $\mathcal{N}at$ having the set of natural numbers as carrier,
and interpreting 0, S, A and M as zero, successor, addition and multiplication,
respectively, is a model of $\mathcal{E}_{Mul}$ and one easily verifies that (5) holds in it. For
instance, for the assignment $\alpha$ mapping every variable to the natural number 2,
its left-hand side M(S(x), S(0)) is mapped to $(2 + 1) \times (0 + 1)$, and its right-hand
side S(x) to $2 + 1$, i.e. both sides are mapped to 3.



Y. Kameyama and P.J. Stuckey (Eds.): FLOPS 2004, LNCS 2998, pp. 180-195, 2004. 
On the other hand, (5) being the conclusion of the proof tree with the inference steps

  (a) $M(x, 0) \approx 0$, by (3)
  (b) $M(S(x), 0) \approx 0$, by ($\sigma$) from (a)
  (c) $S(x) \approx S(x)$, by (ref)
  (d) $A(S(x), M(S(x), 0)) \approx A(S(x), 0)$, by (A) from (c) and (b)
  (e) $A(x, 0) \approx x$, by (1)
  (f) $A(S(x), 0) \approx S(x)$, by ($\sigma$) from (e)
  (g) $A(S(x), M(S(x), 0)) \approx S(x)$, by (trans) from (d) and (f)
  (h) $M(x, S(y)) \approx A(x, M(x, y))$, by (4)
  (i) $M(S(x), S(0)) \approx A(S(x), M(S(x), 0))$, by ($\sigma$) from (h)
  (j) $M(S(x), S(0)) \approx S(x)$, by (trans) from (i) and (g)

with substitution $\sigma$ such that $x \mapsto S(x)$ and $y \mapsto 0$, shows that it is derivable in
equational logic (see Table 1).

On the gripping hand, convertibility of the sides of (5) is witnessed by:

$M(S(x), S(0)) \to A(S(x), M(S(x), 0)) \to A(S(x), 0) \to S(x)$

a sequence of forwards (and possibly backwards) rewrite steps.
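The forward steps above can be replayed mechanically. The following is a minimal sketch, assuming that equations (1) to (4) are oriented from left to right and that terms are encoded as nested tuples with variables as strings; the names `match`, `substitute`, `one_step` and `normalize` are illustrative and not taken from the paper.

```python
# Terms are nested tuples: ("A", t1, t2), ("S", t), ("0",); variables are strings.
RULES = [
    (("A", "x", ("0",)), "x"),                               # (1)
    (("A", "x", ("S", "y")), ("S", ("A", "x", "y"))),        # (2)
    (("M", "x", ("0",)), ("0",)),                            # (3)
    (("M", "x", ("S", "y")), ("A", "x", ("M", "x", "y"))),   # (4)
]

def match(pattern, term, binding):
    """Extend `binding` so that the instantiated pattern equals term, else None."""
    if isinstance(pattern, str):                 # pattern variable
        if pattern in binding:
            return binding if binding[pattern] == term else None
        return {**binding, pattern: term}
    if isinstance(term, str) or pattern[0] != term[0] or len(pattern) != len(term):
        return None
    for p, t in zip(pattern[1:], term[1:]):
        binding = match(p, t, binding)
        if binding is None:
            return None
    return binding

def substitute(term, binding):
    """Replace the variables of `term` simultaneously according to `binding`."""
    if isinstance(term, str):
        return binding.get(term, term)
    return (term[0],) + tuple(substitute(a, binding) for a in term[1:])

def one_step(term):
    """All terms reachable from `term` by one forward rewrite step."""
    results = []
    for lhs, rhs in RULES:
        b = match(lhs, term, {})
        if b is not None:
            results.append(substitute(rhs, b))
    if not isinstance(term, str):
        for i in range(1, len(term)):
            for reduct in one_step(term[i]):
                results.append(term[:i] + (reduct,) + term[i + 1:])
    return results

def normalize(term):
    """Follow forward steps (first choice) until none applies; return the path."""
    path = [term]
    while True:
        successors = one_step(path[-1])
        if not successors:
            return path
        path.append(successors[0])

start = ("M", ("S", "x"), ("S", ("0",)))
path = normalize(start)
```

For the left-hand side of (5), `path` lists the four terms of the conversion shown above, ending in S(x).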

We will refer to the correspondence between validity and derivability as 
Birkhoff’s theorem since it is due to [3], and to the correspondence between 
derivability and convertibility as logicality (cf. [4]). Both correspondences are 
of fundamental importance in the study of programming language foundations, 
see [5], and can be seen as a justification of term rewriting itself. For instance, 
they allow for solving uniform word problems by means of complete term rewrit- 
ing systems. 

As argued by Meseguer, e.g. in [6], some specifications should not be considered
to be equational. For instance an equational specification of a binary
choice function ? (selecting either of its arguments) does not make sense, and
would result in all terms being identified with one another. Instead an ordering
specification is appropriate here:

$?(x, y) \ge x$

$?(x, y) \ge y$

As suggested by the notation, in a model of such an ordering specification each
left-hand side should be greater than or equal to the corresponding right-hand
side. Then, to salvage the correspondence between validity and derivability, the
symmetry rule (sym) should be dropped from the proof system of equational logic
in Table 1, resulting in ordering¹ logic. In order to regain the correspondence
between derivability and convertibility, backwards steps should be dropped from
the latter. After this is done, both correspondences hold again as shown in [6].

¹ Beware: in this paper we will use a systematic naming scheme. For instance, our
ordering logic is known in the literature as rewriting logic.

  ($s \approx t \in \mathcal{E}$):  $s \approx t$
  ($\sigma : X \to T(\Sigma, X)$):  from $s \approx t$ infer $\sigma(s) \approx \sigma(t)$
  ($f \in \Sigma$):  from $s_1 \approx t_1, \ldots, s_n \approx t_n$ infer $f(s_1, \ldots, s_n) \approx f(t_1, \ldots, t_n)$
  (ref):  $s \approx s$
  (sym):  from $s \approx t$ infer $t \approx s$
  (trans):  from $s \approx t$ and $t \approx u$ infer $s \approx u$

Table 1. Equational logic

Here we propose to generalise the correspondence as presented above for
equational and ordering logic, to so-called sub-equational logics obtained by dropping
a subset of the inference rules of equational logic. In particular, equational
and ordering logic are obtained by dropping nothing (the empty set) and the
singleton {(sym)}, respectively. We argue that sub-equational logics are interesting
for the same reason that ordering logic is interesting [6]: enforcing too many
inference rules would conflate notions which one would like to keep distinct. Just 
as confusing the forwards and backwards directions (as enforced by symmetry) 
would be a brutal [6] act in case of the (non-confluent) ordering specification for 
choice above, to confuse ‘not being able to do anything’ with ‘being able to do a 
trivial step’ (as enforced by reflexivity) is a brutal act in case of a (terminating) 
specification such as 






Similarly, single-steps should, a priori, not be confused with many-steps (as 
would be enforced by transitivity) in case of a step specification (a.k.a. term 
rewriting system) , since by the choice for that form of specification one implicitly 
specifies that one is interested in individual steps (think e.g. of complexity). 

Based on the above ideas, we give a parametrised account of both Birkhoff’s 
theorem (Section 3) and logicality (Section 4) for sub-equational specifications 
(Section 2) . The proofs of our results are simple, as they are just variations on the 
existing simple proofs for equational logic. The main effort will be in formalising 
both the results and their proofs in a way which allows for their parametrisation. 
As a side-effect of this parametrisation the proof structure becomes clearer, which 
may be of some didactic value. Because of it, we have made an effort to make 
the paper self-contained. 



2 Sub-equational Specifications 

A sub-equational specification can be thought of as an equational specification 
together with a set of inference modes specifying how its equations are to be 
interpreted (e.g. indeed as equations, or alternatively as rewrite rules, or . . . ). 

Definition 1. A signature $(f, g, h \in)\ \Sigma$ is a set of symbols, each of which comes
equipped with a natural number arity.

The subset of $\Sigma$ consisting of all symbols of arity $n$ is denoted by $\Sigma^{(n)}$. Elements
of $\Sigma^{(0)}$ are called constants. Throughout, we assume $(x, y, z \in)\ X$ to be a signature
disjoint from $\Sigma$, consisting of an infinite number of constants called variables.
Definition 2. The set $(s, t, u \in)\ T(\Sigma)$ of $\Sigma$-terms is inductively defined by:

- $f s_1 \ldots s_n$ is a term, if $f$ is an $n$-ary symbol and $s_1, \ldots, s_n$ are terms.

The set $T(\Sigma, X)$ of $\Sigma$-terms over $X$ is defined as $T(\Sigma \cup X)$.

As is customary, we may write $f(s_1, \ldots, s_n)$ to denote $f s_1 \ldots s_n$.

Example 1. Consider the signature $\Sigma$ consisting of the nullary symbol 0, the
unary symbol S and the binary symbols A and M. Some $\Sigma$-terms are 0, S0, SS0,
A00 and MA0S00. E.g. the last term is also denoted by M(A(0, S(0)), 0). An example
of a $\Sigma$-term over $X$ is A(x, S(y)).
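Since every symbol has a fixed arity, the parenthesis-free notation can be read back unambiguously. The following sketch, with the hypothetical helpers `parse` and `show` and a nested-tuple encoding of terms chosen here for illustration, converts between the two notations for the signature of Example 1; symbols outside the signature are treated as variables.

```python
ARITY = {"0": 0, "S": 1, "A": 2, "M": 2}   # the signature of Example 1

def parse(prefix, arity=ARITY):
    """Parse a parenthesis-free term such as "MA0S00" into a nested tuple."""
    def term(pos):
        symbol = prefix[pos]
        n = arity.get(symbol, 0)      # unknown symbols count as variables (arity 0)
        pos += 1
        args = []
        for _ in range(n):
            arg, pos = term(pos)
            args.append(arg)
        return (symbol, *args), pos

    result, end = term(0)
    if end != len(prefix):
        raise ValueError("trailing symbols after a complete term")
    return result

def show(t):
    """Render a nested tuple in the customary f(s1, ..., sn) notation."""
    symbol, *args = t
    return symbol if not args else f"{symbol}({', '.join(show(a) for a in args)})"

last_term = show(parse("MA0S00"))
```

The one-character-per-symbol reading is a simplification that suffices for this signature.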

Definition 3. A $\Sigma$-statement is a pair of $\Sigma$-terms.

Definition 4. A (sub-equational) specification is a quadruple $\mathcal{S} := (\Sigma, X, S, L)$
with $S$ a set of $\Sigma \cup X$-statements and $L$ a set of inference modes which is a subset
of {(embedding), (compatibility), (reflexivity), (symmetry), (transitivity)}.

We will abbreviate the respective inference modes to (emb), (comp), (ref), (sym),
and (trans). The idea is that for a sub-equational specification $\mathcal{S} := (\Sigma, X, S, L)$,
the modes of inference will specify how the pairs in $S$ are to be dealt with, both
at the semantical and the syntactical level (both to be presented below).

Example 2. An equational specification is a sub-equational specification having
EL := {(emb), (comp), (ref), (sym), (trans)} as modes of inference. A statement
$(s, t)$ of such a specification will be called an equation, and written as $s \approx t$.
The equational specification $\mathcal{E}_{Mul}$ in the introduction consists of four $\Sigma$-equations
over $X$, that is, $\Sigma \cup X$-equations, with $\Sigma$ as in Example 1.

Example 3. In an ordering specification all modes of inference except for (sym)
are present: RL := {(emb), (comp), (ref), (trans)}.² A statement $(s, t)$ of such a
specification will be called an ordering, and written as $s \ge t$.

The specification of ? in the introduction is an ordering specification.

Similarly term rewriting systems³ are rendered as sub-equational specifications
by taking {(emb), (comp)} as modes of inference. Its statements are written using
$\to$ as usual.⁴ The TRS corresponding to $\mathcal{E}_{Mul}$ will be denoted by $\mathcal{R}_{Mul}$.

We will not list all possible sets of inference modes, but only mention one 
more example, which will be used later. 

Example 4- Removing {(ref), (sym)} from the modes of inference of £M.ul yields 
what we call a positive ordering specification EXiul, having as fourth component 
TL := {(emb), (comp), (trans)}. A statement (s, f) of such a specification will be 
called a positive ordering, and written as s > t. 

² This corresponds to Meseguer’s rewriting logic.

³ To be precise, our term rewriting systems (TRSs) correspond to the pseudo-TRSs
of [1, page 36] since we do not impose the usual further restrictions on rules.

⁴ Note that although transition system specifications usually employ the $\to$-notation
as well, the (comp)-inference mode is absent for them.







Vincent van Oostrom 



3 Birkhoff 

In this section the correspondence between validity (Subsection 3.1) and deriv- 
ability (Subsection 3.2) for sub-equational specifications is presented in two 
stages. In Subsection 3.3 we first present a correspondence between relational 
validity and derivability, which is then extended in Subsection 3.4 to a corre- 
spondence between validity and derivability by a quotient construction. 

3.1 Validity 

As usual, algebras are used to give meaning to the terms of a sub-equational 
specification. However, the notion of validity of a statement (s, t) with respect 
to a specification S will now be parametrised over its modes of inference. 

Definition 5. A $\Sigma$-algebra $\mathcal{A}$ consists of a carrier set $A$, and a mapping that
associates with each symbol $f \in \Sigma^{(n)}$ a function $f^{\mathcal{A}} : A^n \to A$, for every $n$.

An assignment is an $X$-algebra. For a $\Sigma$-algebra $\mathcal{A}$ and an assignment $\alpha$ having
the same carrier, $\mathcal{A} \cup \alpha$ denotes the obvious $\Sigma \cup X$-algebra.

Example 5. 1. The algebra $\mathcal{N}at$ of the introduction is a $\Sigma$-algebra, for the
signature $\Sigma$ of Example 1. For the same carrier, $\alpha$ of the introduction is an
example of an assignment.

2. The $\Sigma$-term algebra $\mathcal{T}(\Sigma)$ has $T(\Sigma)$ as carrier, and interpretation defined
by, for all $n$, all $f \in \Sigma^{(n)}$, and all $s_1, \ldots, s_n \in T(\Sigma)$: $f^{\mathcal{T}(\Sigma)}(s_1, \ldots, s_n) := f(s_1, \ldots, s_n)$.

Definition 6. A $\Sigma$-homomorphism $h$ from a $\Sigma$-algebra $\mathcal{A}$ to a $\Sigma$-algebra $\mathcal{B}$ is
a map from the carrier $A$ of $\mathcal{A}$ to the carrier $B$ of $\mathcal{B}$, such that for all $n$, all
$f \in \Sigma^{(n)}$, and all $a_1, \ldots, a_n \in A$: $h(f^{\mathcal{A}}(a_1, \ldots, a_n)) = f^{\mathcal{B}}(h(a_1), \ldots, h(a_n))$.

It is easy to see that $\mathcal{T}(\Sigma)$ is initial among $\Sigma$-algebras, i.e. for any $\Sigma$-algebra
$\mathcal{A}$, there is a unique homomorphism from $\mathcal{T}(\Sigma)$ to $\mathcal{A}$, which we denote by $[\![\mathcal{A}]\!]$.

Example 6. 1. The unique homomorphism $[\![\mathcal{N}at \cup \alpha]\!]$ which maps $\mathcal{T}(\Sigma, X)$ to
$\mathcal{N}at \cup \alpha$, with $\mathcal{N}at$ and $\alpha$ as in Example 5.1, is concretely defined by:

- $x \mapsto 2$, for $x \in X$
- $f(s_1, \ldots, s_n) \mapsto f^{\mathcal{N}at}(n_1, \ldots, n_n)$, for $f \in \Sigma$ and $s_i \mapsto n_i$

For instance, M(S(x), S(0)) is mapped to $(2 + 1) \times (0 + 1)$, i.e. to 3.

2. A substitution is the unique homomorphism $[\![\mathcal{T}(\Sigma, X) \cup \sigma]\!]$ of some assignment
$\sigma$. For instance, if $\sigma$ assigns S(S(0)) to $x$, then applying the substitution
to M(S(x), S(0)) yields M(S(S(S(0))), S(0)). We will often abbreviate the substitution
to just $\sigma$.
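Both items of Example 6 are instances of one recursive evaluation function. A minimal sketch follows, assuming the nested-tuple encoding of terms used for illustration here (variables as strings); `evaluate` and `term_algebra` are hypothetical names, and an algebra is modelled as a dictionary from symbols to Python functions.

```python
NAT = {                       # the algebra Nat: one function per symbol
    "0": lambda: 0,
    "S": lambda n: n + 1,
    "A": lambda m, n: m + n,
    "M": lambda m, n: m * n,
}

def evaluate(term, algebra, assignment):
    """The homomorphism from terms into `algebra`, extended by `assignment`."""
    if isinstance(term, str):                  # a variable
        return assignment[term]
    symbol, *args = term
    return algebra[symbol](*(evaluate(a, algebra, assignment) for a in args))

def term_algebra(symbols):
    """The term algebra: every symbol is interpreted as itself."""
    return {f: (lambda *args, f=f: (f, *args)) for f in symbols}

lhs = ("M", ("S", "x"), ("S", ("0",)))
alpha = {"x": 2}                               # assignment into Nat
sigma = {"x": ("S", ("S", ("0",)))}            # assignment into the term algebra

value = evaluate(lhs, NAT, alpha)
instance = evaluate(lhs, term_algebra(NAT), sigma)
```

Evaluating in `NAT` gives the number computed in item 1, while evaluating in the term algebra performs the substitution of item 2.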



Definition 7. Let $\mathcal{S} := (\Sigma, X, S, L)$ be a specification. A relational model of $\mathcal{S}$
is a pair $(\mathcal{A}, R)$ consisting of a $\Sigma$-algebra $\mathcal{A}$ and a relation $R$ on the carrier $A$ of
the algebra, satisfying each rule $\ell$ in Table 2, for $\ell \in L$. Here

- $s\ R\ t$ expresses that for all assignments $\alpha$, it holds $[\![\mathcal{A} \cup \alpha]\!](s)\ R\ [\![\mathcal{A} \cup \alpha]\!](t)$.
- $a_1, \ldots, a_n =[R]\ b_1, \ldots, b_n$ expresses that corresponding components of $a_1, \ldots, a_n$ and $b_1, \ldots, b_n$ are identical, except for one index, say $i$, for which $a_i\ R\ b_i$.

$(s, t)$ is relationally valid in $\mathcal{S}$, $\models s\ \mathcal{S}\ t$, if $s\ R\ t$ holds in every relational model
$(\mathcal{A}, R)$ of $\mathcal{S}$.

  (emb, $(s, t) \in S$):  $s\ R\ t$
  (comp, $f \in \Sigma$):  from $a_1, \ldots, a_n =[R]\ b_1, \ldots, b_n$ infer $f^{\mathcal{A}}(a_1, \ldots, a_n)\ R\ f^{\mathcal{A}}(b_1, \ldots, b_n)$
  (ref):  $a\ R\ a$
  (sym):  from $a\ R\ b$ infer $b\ R\ a$
  (trans):  from $a\ R\ b$ and $b\ R\ c$ infer $a\ R\ c$

Table 2. Relational models
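On a finite carrier the rules of Table 2 can be checked exhaustively. The sketch below is an illustration under assumptions that are not in the paper: the carrier is the integers modulo 4 with the evident interpretation of 0, S, A and M, the statements are the four equations of the introduction, and the names `holds` and `is_relational_model` are hypothetical.

```python
from itertools import product

CARRIER = range(4)                       # assumed example: arithmetic modulo 4
ALGEBRA = {"0": lambda: 0, "S": lambda a: (a + 1) % 4,
           "A": lambda a, b: (a + b) % 4, "M": lambda a, b: (a * b) % 4}
ARITY = {"0": 0, "S": 1, "A": 2, "M": 2}
STATEMENTS = [                           # the four equations as pairs of terms
    (("A", "x", ("0",)), "x"),
    (("A", "x", ("S", "y")), ("S", ("A", "x", "y"))),
    (("M", "x", ("0",)), ("0",)),
    (("M", "x", ("S", "y")), ("A", "x", ("M", "x", "y"))),
]

def evaluate(term, assignment):
    if isinstance(term, str):
        return assignment[term]
    return ALGEBRA[term[0]](*(evaluate(a, assignment) for a in term[1:]))

def holds(mode, R):
    """Check one rule of Table 2 for the relation R (a set of pairs) on CARRIER."""
    if mode == "emb":
        return all((evaluate(s, {"x": a, "y": b}), evaluate(t, {"x": a, "y": b})) in R
                   for s, t in STATEMENTS for a, b in product(CARRIER, repeat=2))
    if mode == "comp":   # change exactly one argument along R, keep the others fixed
        return all((ALGEBRA[f](*args[:i], a, *args[i:]),
                    ALGEBRA[f](*args[:i], b, *args[i:])) in R
                   for f, n in ARITY.items() if n > 0
                   for i in range(n)
                   for args in product(CARRIER, repeat=n - 1)
                   for a, b in R)
    if mode == "ref":
        return all((a, a) in R for a in CARRIER)
    if mode == "sym":
        return all((b, a) in R for a, b in R)
    if mode == "trans":
        return all((a, c) in R for a, b in R for b2, c in R if b == b2)
    raise ValueError(mode)

def is_relational_model(R, modes):
    return all(holds(mode, R) for mode in modes)

EL = ["emb", "comp", "ref", "sym", "trans"]
identity = {(a, a) for a in CARRIER}
same_parity = {(a, b) for a in CARRIER for b in CARRIER if (a - b) % 2 == 0}
less_or_equal = {(a, b) for a in CARRIER for b in CARRIER if a <= b}
```

In this toy setting both `identity` and `same_parity` pass all five rules, so the relation of a relational model need not be the identity, whereas `less_or_equal` fails (sym) and, because of the wrap-around of the successor, also (comp).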



Remark 1. Since $s\ R\ t$ depends on $\mathcal{A}$ as well, formally we should consider it to
be an abbreviation of $s\ R_{\mathcal{A}}\ t$. One may think of relational models as models of
a predicate logic with one binary predicate symbol.

The (comp)-rule is a direct generalisation of the usual compatibility rules found
in mathematics and rewriting. In the following examples, the relational models
for equational, ordering, and positive ordering specifications are characterised.
To that end, recall that a relation $R$ is a congruence relation for an algebra $\mathcal{A}$, if
it is an equivalence relation which is preserved by the operations of $\mathcal{A}$, i.e. such
that for every $n$-ary operation $\phi$, if $a_1\ R\ b_1, \ldots, a_n\ R\ b_n$, then $\phi(a_1, \ldots, a_n)\ R\ \phi(b_1, \ldots, b_n)$.⁵
In each example, we will assume the relational model to be $(\mathcal{A}, R)$.



Example 7. In case of an equational specification, $R$ is seen to be a congruence
relation as follows. Since {(ref), (sym), (trans)} $\subseteq$ EL, $R$ is an equivalence relation.
To see that it is a congruence relation, suppose $\phi$ is an $n$-ary operation in
$\mathcal{A}$ and $a_1\ R\ b_1, \ldots, a_n\ R\ b_n$, then we conclude from

$\phi(a_1, \ldots, a_n)\ R\ \phi(b_1, a_2, \ldots, a_n)\ R\ \cdots\ R\ \phi(b_1, \ldots, b_n)$

using {(comp), (trans)} $\subseteq$ EL.

Models of equational specifications in the standard sense of the introduction
give rise to relational models, just by pairing them up with the identity relation
id. For instance, $(\mathcal{N}at, id)$ is a relational model of $\mathcal{E}_{Mul}$: id is trivially a congruence
relation, and (emb) is forced to hold by the assumption that $\mathcal{N}at$ is a
model, in the standard sense, of $\mathcal{E}_{Mul}$.

However, note that $R$ is in general not forced to be the identity relation. For
instance, an example of a relational model for the equational specification $\mathcal{E}_{Mul}$
consists of its term algebra $\mathcal{T}(\Sigma, X)$ and the convertibility relation (see
Example 16).

⁵ Hence the distinction between compatibility and congruence is that the latter requires
all corresponding premisses to be related, whereas the former requires exactly
one pair of corresponding premisses to be related (and the rest to be identical).



Example 8. In case of an ordering specification, $R$ is seen to be an operation-preserved
quasi-order. That it is a quasi-order, i.e. reflexive and transitive, follows
since {(ref), (trans)} $\subseteq$ RL. That it is operation-preserved follows as in the
previous item.

Example 9. In case of a positive ordering specification, the relation $R$ is an
operation-preserved transitive relation.

Example 10. Any relational model for the equational specification $\mathcal{E}_{Mul}$ is
automatically a relational model for its associated TRS $\mathcal{R}_{Mul}$. Of course, this
does not hold the other way around. For instance, combining the polynomial
interpretation of [1, Example 6.2.13] with the natural order > on the natural
numbers yields a relational model of $\mathcal{R}_{Mul}$, but not of $\mathcal{E}_{Mul}$, because > is
not symmetric. Although symmetry is lacking, transitivity is not, hence this
interpretation is a model of the positive ordering specification of Example 4.

As the first example shows, there is a mismatch between the notion of a model
and that of a relational model. It is analogous to the difference between the
notions of model of predicate logic with and without equality: in the former the
interpretation of the binary equality predicate is fixed to the identity relation,
whereas in the latter its interpretation can in principle be any relation (possibly
satisfying some constraints). That is, there are many more relational models
than there are models. This mismatch will be overcome in Subsection 3.4.

3.2 Derivability 

Definition 8. The judgment that a statement $(s, t)$ is derivable by means of
sub-equational logic, for a given sub-equational specification $\mathcal{S}$, is denoted by $\vdash s\ \mathcal{S}\ t$.
The axioms and rules of sub-equational logic are the ones listed in Table 3. The
theory $\underline{S}$ of $\mathcal{S}$ is the relation on terms, consisting of all derivable pairs.

  (emb, $(s, t) \in S$, $\sigma : X \to T(\Sigma, X)$):  $\sigma(s)\ \underline{S}\ \sigma(t)$
  (comp, $f \in \Sigma$):  from $s_1, \ldots, s_n =[\underline{S}]\ t_1, \ldots, t_n$ infer $f(s_1, \ldots, s_n)\ \underline{S}\ f(t_1, \ldots, t_n)$
  (ref):  $s\ \underline{S}\ s$
  (sym):  from $s\ \underline{S}\ t$ infer $t\ \underline{S}\ s$
  (trans):  from $s\ \underline{S}\ t$ and $t\ \underline{S}\ u$ infer $s\ \underline{S}\ u$

Table 3. Sub-equational logic for sub-equational specification $\mathcal{S} := (\Sigma, X, S, L)$

Here derivability of a statement means that it is the conclusion of some proof
tree built from the inference rules, as usual. Note that an inference rule only
applies when it is an allowed mode of inference, according to the specification.
Furthermore, not all modes of inference of standard equational logic as presented
in Table 1 are (directly) at our disposal in sub-equational logic, not even
for an equational sub-equational specification, where all modes of inference are
available. The reason is that the standard inference rules of equational logic
exhibit some dependencies which we have avoided here, in order to make the
connexion between syntax and semantics smoother. In particular, the equation-
and substitution-rule have been merged into the (emb)-rule. Furthermore, the
(comp)-rule allows one to relate only one argument at a time whereas the standard
presentation has a congruence-rule. Nevertheless, the two presentations are
easily seen to be equivalent as illustrated by the following example.

Example 11. Redrawing the proof tree of the introduction for the sub-equational
specification corresponding to $\mathcal{E}_{Mul}$, omitting parentheses and $\mathcal{E}_{Mul}$ to save
space, yields the inference steps

  (a) (M(Sx, 0), 0), by ($\sigma$)
  (b) (A(Sx, M(Sx, 0)), A(Sx, 0)), by (comp, A) from (a)
  (c) (A(Sx, 0), Sx), by ($\sigma$)
  (d) (A(Sx, M(Sx, 0)), Sx), by (trans) from (b) and (c)
  (e) (M(Sx, S0), A(Sx, M(Sx, 0))), by ($\sigma$)
  (f) (M(Sx, S0), Sx), by (trans) from (e) and (d)

More precisely, for a given equational specification $\mathcal{E}$, its derivability in equational
logic $\mathcal{E} \vdash s \approx t$ coincides with its derivability $\vdash s\ \mathcal{E}\ t$ in equational
sub-equational logic.
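That an inference rule only applies when its mode is allowed can be made concrete with a small proof checker. The sketch below is an illustration under stated assumptions: terms are nested tuples with variables as strings, a derivation is a list of steps (mode, statement, indices of premises), the steps labelled ($\sigma$) in the example are recorded under the mode name "emb", and the names `match`, `instance_of` and `check` are hypothetical.

```python
STATEMENTS = [
    (("A", "x", ("0",)), "x"),
    (("A", "x", ("S", "y")), ("S", ("A", "x", "y"))),
    (("M", "x", ("0",)), ("0",)),
    (("M", "x", ("S", "y")), ("A", "x", ("M", "x", "y"))),
]

def match(pattern, term, binding):
    """Extend `binding` so that the instantiated pattern equals term, else None."""
    if isinstance(pattern, str):
        if pattern in binding:
            return binding if binding[pattern] == term else None
        return {**binding, pattern: term}
    if isinstance(term, str) or pattern[0] != term[0] or len(pattern) != len(term):
        return None
    for p, t in zip(pattern[1:], term[1:]):
        binding = match(p, t, binding)
        if binding is None:
            return None
    return binding

def instance_of(statement, pair):
    """Is `pair` obtained from `statement` by one substitution on both sides?"""
    b = match(statement[0], pair[0], {})
    return b is not None and match(statement[1], pair[1], b) is not None

def check(derivation, modes):
    """Each step is (mode, (s, t), premises); premises index earlier steps."""
    for k, (mode, (s, t), premises) in enumerate(derivation):
        if mode not in modes or any(p >= k for p in premises):
            return False
        prem = [derivation[p][1] for p in premises]
        if mode == "emb":
            ok = not prem and any(instance_of(st, (s, t)) for st in STATEMENTS)
        elif mode == "ref":
            ok = not prem and s == t
        elif mode == "sym":
            ok = prem == [(t, s)]
        elif mode == "trans":
            ok = (len(prem) == 2 and prem[0][0] == s
                  and prem[0][1] == prem[1][0] and prem[1][1] == t)
        elif mode == "comp":   # exactly one argument changes, along the premise
            ok = (len(prem) == 1 and not isinstance(s, str) and not isinstance(t, str)
                  and len(s) == len(t)
                  and any((s[i], t[i]) == prem[0] and s[:i] + s[i + 1:] == t[:i] + t[i + 1:]
                          for i in range(1, len(s))))
        else:
            ok = False
        if not ok:
            return False
    return True

Sx, Z = ("S", "x"), ("0",)
example_11 = [
    ("emb", (("M", Sx, Z), Z), []),
    ("comp", (("A", Sx, ("M", Sx, Z)), ("A", Sx, Z)), [0]),
    ("emb", (("A", Sx, Z), Sx), []),
    ("trans", (("A", Sx, ("M", Sx, Z)), Sx), [1, 2]),
    ("emb", (("M", Sx, ("S", Z)), ("A", Sx, ("M", Sx, Z))), []),
    ("trans", (("M", Sx, ("S", Z)), Sx), [4, 3]),
]
EL = {"emb", "comp", "ref", "sym", "trans"}
TL = {"emb", "comp", "trans"}
TRS = {"emb", "comp"}
```

The derivation is accepted for the modes EL and also for TL, since it uses neither (ref) nor (sym); it is rejected for the modes of a term rewriting system, which lack (trans).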



3.3 Relational Term Model 

Derivability can be related to relational validity, by constructing a so-called 
relational term model for a specification. 

Lemma 1 (Term Model). $\vdash s\ \mathcal{S}\ t$ iff $\models s\ \mathcal{S}\ t$, for any specification $\mathcal{S}$.

Proof. Define the relational term model $\mathcal{M}(\mathcal{S})$ of a sub-equational specification
$\mathcal{S} := (\Sigma, X, S, L)$ as the pair $(\mathcal{T}(\Sigma, X), \underline{S})$, where $\mathcal{T}(\Sigma, X)$ is the term algebra
and $\underline{S}$ the theory of $\mathcal{S}$.

To prove the if-direction (completeness), it suffices to prove that $\mathcal{M}(\mathcal{S})$ is a
relational model for $\mathcal{S}$, by the choice of the theory of $\mathcal{S}$ as the relation of $\mathcal{M}(\mathcal{S})$.
Since $\mathcal{T}(\Sigma, X)$ is a $\Sigma \cup X$-algebra by Example 5, it certainly is a $\Sigma$-algebra.
Hence to verify that $\mathcal{M}(\mathcal{S})$ is indeed a relational model for $\mathcal{S}$, it remains to show
that rule $\ell$ holds in theory $\underline{S}$, for each $\ell \in L$. Intuitively, this will hold by the
1-1 correspondence between the rules of relational models in Table 2 and the
inference rules of sub-equational logics in Table 3. For a proof, we distinguish
cases for the rules.

(emb) Suppose $(s, t) \in S$. By the (emb)-rule of Table 2, we must verify that
for any assignment $\alpha$, it holds that $[\![\mathcal{T}(\Sigma, X) \cup \alpha]\!](s)$ is related by $\underline{S}$ to
$[\![\mathcal{T}(\Sigma, X) \cup \alpha]\!](t)$. By the definition of substitution, this is just the same as
saying that $\sigma(s)$ is related by $\underline{S}$ to $\sigma(t)$ for any substitution $\sigma$, which holds
by the (emb)-inference rule of the logic.

(comp), (ref), (sym), (trans) Each rule in Table 2 directly follows from the
corresponding rule of Table 3, where (comp) also uses that symbols are
interpreted as themselves in relational term models.

To prove the only-if-direction (soundness), it suffices to prove by induction on 
derivations (proof trees) that pairs in the theory S_, are related in any relational 
model {A, R) of S. The proof is by cases on the modes of inference in L, showing 
that the statement holds for a proof whose conclusion uses inference rule by 
using rule ^ of Table 2. 

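(emb) Suppose $(s,t) \in S$ and let $\sigma$ be some substitution. We have to show
$[\![\mathcal{A} \cup \alpha]\!](\sigma(s)) \; R \; [\![\mathcal{A} \cup \alpha]\!](\sigma(t))$, for any assignment $\alpha$. Suppose we can show
the so-called semantic substitution lemma:

$[\![\mathcal{A} \cup \alpha]\!](\sigma(u)) = [\![\mathcal{A} \cup \alpha_\sigma]\!](u)$ (6)

where the assignment $\alpha_\sigma$ maps a variable $x \in X$ to the value of $\sigma(x)$ in the
algebra $\mathcal{A}$ under the assignment $\alpha$, i.e. to $[\![\mathcal{A} \cup \alpha]\!](\sigma(x))$.

Then we are done, since

$[\![\mathcal{A} \cup \alpha]\!](\sigma(s)) = [\![\mathcal{A} \cup \alpha_\sigma]\!](s) \; R \; [\![\mathcal{A} \cup \alpha_\sigma]\!](t) = [\![\mathcal{A} \cup \alpha]\!](\sigma(t))$

by (emb) of Table 2, and the semantic substitution lemma (twice).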

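It remains to show (6), which is proven by induction on $u \in T(\Sigma, X)$.

(variable)

$[\![\mathcal{A} \cup \alpha]\!](\sigma(x)) = \alpha_\sigma(x)$
$= [\![\alpha_\sigma]\!](x)$
$= [\![\mathcal{A} \cup \alpha_\sigma]\!](x)$

(symbol) For all $n$, all $n$-ary $f \in \Sigma$ and all $s_1, \ldots, s_n \in T(\Sigma, X)$:

$[\![\mathcal{A} \cup \alpha]\!](\sigma(f(s_1,\ldots,s_n))) = [\![\mathcal{A} \cup \alpha]\!](f(\sigma(s_1),\ldots,\sigma(s_n)))$
$= f^{\mathcal{A} \cup \alpha}([\![\mathcal{A} \cup \alpha]\!](\sigma(s_1)), \ldots, [\![\mathcal{A} \cup \alpha]\!](\sigma(s_n)))$
$=_{IH} f^{\mathcal{A} \cup \alpha}([\![\mathcal{A} \cup \alpha_\sigma]\!](s_1), \ldots, [\![\mathcal{A} \cup \alpha_\sigma]\!](s_n))$
$= f^{\mathcal{A}}([\![\mathcal{A} \cup \alpha_\sigma]\!](s_1), \ldots, [\![\mathcal{A} \cup \alpha_\sigma]\!](s_n))$
$= f^{\mathcal{A} \cup \alpha_\sigma}([\![\mathcal{A} \cup \alpha_\sigma]\!](s_1), \ldots, [\![\mathcal{A} \cup \alpha_\sigma]\!](s_n))$
$= [\![\mathcal{A} \cup \alpha_\sigma]\!](f(s_1,\ldots,s_n))$.

This concludes the proof of the semantic substitution lemma.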

(comp), (ref), (sym), (trans) As for the other direction, these are trivial. □

Note that what we have really is a term model, i.e. terms are interpreted as terms
(even stronger: as themselves), unlike the standard term models where terms are 
interpreted as equivalence classes of terms. The latter will be constructed in the 
following subsection. 
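The semantic substitution lemma (6) used in the proof above can be checked on a small instance. The following is a minimal sketch, not part of the original paper: the term type, the integer algebra `interp`, and the particular `alpha`, `sigma` and `u` are all illustrative assumptions.

```ocaml
(* Terms over variables (strings) and function symbols (strings). *)
type term = Var of string | Fun of string * term list

(* Apply a substitution sigma (variables -> terms) to a term. *)
let rec subst sigma = function
  | Var x -> sigma x
  | Fun (f, args) -> Fun (f, List.map (subst sigma) args)

(* Evaluate a term in an algebra: interp interprets the symbols,
   alpha assigns carrier elements to the variables. *)
let rec eval interp alpha = function
  | Var x -> alpha x
  | Fun (f, args) -> interp f (List.map (eval interp alpha) args)

(* Hypothetical example algebra on the integers. *)
let interp f args =
  match f, args with
  | "zero", [] -> 0
  | "succ", [a] -> a + 1
  | "add", [a; b] -> a + b
  | _ -> invalid_arg ("unknown symbol or wrong arity: " ^ f)

(* alpha_sigma maps x to the value of sigma(x) under alpha, as in (6). *)
let alpha_sigma interp alpha sigma x = eval interp alpha (sigma x)

let alpha = function "x" -> 3 | "y" -> 4 | _ -> 0
let sigma = function "x" -> Fun ("succ", [Var "y"]) | v -> Var v
let u = Fun ("add", [Var "x"; Fun ("succ", [Var "y"])])

(* Left- and right-hand side of (6) for this instance. *)
let lhs = eval interp alpha (subst sigma u)
let rhs = eval interp (alpha_sigma interp alpha sigma) u
```

Both sides evaluate the same term tree; the only difference is whether the substitution is applied syntactically first (`lhs`) or folded into the assignment (`rhs`).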

3.4 Quotienting Out a Maximal Congruence 

To overcome the mismatch between relational models and models observed
above, we show that any relational model can be turned into a model, by
quotienting out a maximal congruence relation. Quotienting out a congruence relation
$\cong$ consists in taking $\cong$-equivalence classes of elements as new elements.

Definition 9. Let $\mathcal{M} := (\mathcal{A}, R)$ be a relational model of $\mathcal{S} := (\Sigma, X, S, L)$ and
let $\cong$ be a congruence relation on the carrier $A$ of $\mathcal{A}$. The quotient $\mathcal{M}/{\cong}$ of $\mathcal{M}$
by $\cong$ is the pair $(\mathcal{A}/{\cong}, R/{\cong})$ defined by:

- The quotient algebra $\mathcal{A}/{\cong}$ of $\mathcal{A}$ by $\cong$ is defined by:

  • The carrier $A/{\cong}$ of $\mathcal{A}/{\cong}$ consists of the $\cong$-equivalence classes $[a]_{\cong}$ for
    $a \in A$.

  • The interpretation of symbols is given by: for all $n$, for all $n$-ary $f \in \Sigma$ and
    all $a_1, \ldots, a_n \in A$

    $f^{\mathcal{A}/{\cong}}([a_1]_{\cong}, \ldots, [a_n]_{\cong}) := [f^{\mathcal{A}}(a_1, \ldots, a_n)]_{\cong}$

- The relation $R/{\cong}$ on the carrier $A/{\cong}$ of $\mathcal{A}/{\cong}$ is defined by:

    $[a]_{\cong} \; R/{\cong} \; [b]_{\cong} \;:=\; a \;({\cong} ; R ; {\cong})\; b$, where ; denotes relation composition.

Neither the definition of the quotient algebra nor of the quotient relation depends
on the choice of the representatives, because $\cong$ is a congruence relation. Under
some constraints, taking quotients preserves and ‘reflects’ modelhood.

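Lemma 2 (Quotient). Let $\mathcal{S} := (\Sigma,X,S,L)$ be a specification, $\mathcal{M} := (\mathcal{A},R)$
be a relational model of $\mathcal{S}$, and $\cong$ a congruence relation on the carrier $A$ of $\mathcal{A}$.

- If ${\cong} \subseteq R^*$, then $\mathcal{M}/{\cong}$ is a relational model of $\mathcal{S}$ again.

- If moreover (trans) $\in L$, then $[a]_{\cong} \; R/{\cong} \; [b]_{\cong}$ implies $a \; R \; b$.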

Proof. We first show the first item. Let $\mathcal{S} := (\Sigma,X,S,L)$. We must verify for
each $\ell \in L$, that if $R$ satisfies the inference rule $\ell$ for $\mathcal{M}$, then $R/{\cong}$ does so for
$\mathcal{M}/{\cong}$. Except for the (emb) rule all cases are easy:

(comp) We have to show that $[a_1]_{\cong}, \ldots, [a_n]_{\cong} =[R/{\cong}] \; [b_1]_{\cong}, \ldots, [b_n]_{\cong}$ implies
$f^{\mathcal{A}/{\cong}}([a_1]_{\cong}, \ldots, [a_n]_{\cong}) \; R/{\cong} \; f^{\mathcal{A}/{\cong}}([b_1]_{\cong}, \ldots, [b_n]_{\cong})$. By the assumption it holds
that $a_i \cong b_i$, except say for $j$, for which $a_j \;({\cong};R;{\cong})\; b_j$. By (comp) for $R$ and
congruence of $\cong$, we obtain $f^{\mathcal{A}}(a_1, \ldots, a_n) \;({\cong};R;{\cong})\; f^{\mathcal{A}}(b_1, \ldots, b_n)$, from
which the claim follows by definition of $f^{\mathcal{A}/{\cong}}$.

(ref) If $R$ is reflexive, then ${\cong};R;{\cong}$ is reflexive by the assumption that $\cong$ is
a congruence relation hence reflexive, so $R/{\cong}$ is reflexive as well.

(sym) If $R$ is symmetric, then ${\cong};R;{\cong}$ is symmetric by the assumption that
$\cong$ is a congruence relation hence symmetric, so $R/{\cong}$ is symmetric as well.

(trans) If $R$ is transitive, then ${\cong};R;{\cong}$ is transitive by the assumption that
$\cong$ is contained in the reflexive-transitive closure of $R$, so $R/{\cong}$ is transitive
as well.

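It remains to verify that the (emb) rule holds for $R/{\cong}$ under the assumption that it
holds for $R$. So suppose $(s, t) \in S$. We have to show

$[\![\mathcal{A}/{\cong} \cup \beta]\!](s) \; R/{\cong} \; [\![\mathcal{A}/{\cong} \cup \beta]\!](t)$

for any assignment $\beta$ of $\cong$-equivalence classes of $A$ to variables. We will show
the so-called syntactic substitution lemma:

$[\![\mathcal{A}/{\cong} \cup \beta]\!](u) = [\,[\![\mathcal{A} \cup \alpha]\!](u)\,]_{\cong}$ (7)

for any assignment $\alpha$ ‘picking’ elements from those classes, i.e. such that $\alpha$ maps
each variable $x$ to an element of $\beta(x)$. Then we conclude, by definition of $R/{\cong}$,

$[\![\mathcal{A}/{\cong} \cup \beta]\!](s) = [\,[\![\mathcal{A} \cup \alpha]\!](s)\,]_{\cong} \; R/{\cong} \; [\,[\![\mathcal{A} \cup \alpha]\!](t)\,]_{\cong} = [\![\mathcal{A}/{\cong} \cup \beta]\!](t)$

using the assumption that $[\![\mathcal{A} \cup \alpha]\!](s) \; R \; [\![\mathcal{A} \cup \alpha]\!](t)$ for any $\alpha$. It remains to
show (7) for all $u \in T(\Sigma,X)$, which we prove by induction on $u$.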

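(variable) Since $\alpha$ was assumed to pick elements from $\beta$:

$[\![\mathcal{A}/{\cong} \cup \beta]\!](x) = \beta(x) = [\alpha(x)]_{\cong} = [\,[\![\mathcal{A} \cup \alpha]\!](x)\,]_{\cong}$.

(symbol) For all $n$, all $n$-ary $f \in \Sigma$ and all $u_1, \ldots, u_n \in T(\Sigma, X)$:

$[\![\mathcal{A}/{\cong} \cup \beta]\!](f(u_1,\ldots,u_n))$
$= f^{\mathcal{A}/{\cong} \cup \beta}([\![\mathcal{A}/{\cong} \cup \beta]\!](u_1), \ldots, [\![\mathcal{A}/{\cong} \cup \beta]\!](u_n))$
$=_{IH} f^{\mathcal{A}/{\cong} \cup \beta}([\,[\![\mathcal{A} \cup \alpha]\!](u_1)\,]_{\cong}, \ldots, [\,[\![\mathcal{A} \cup \alpha]\!](u_n)\,]_{\cong})$
$= f^{\mathcal{A}/{\cong}}([\,[\![\mathcal{A} \cup \alpha]\!](u_1)\,]_{\cong}, \ldots, [\,[\![\mathcal{A} \cup \alpha]\!](u_n)\,]_{\cong})$
$= [\,f^{\mathcal{A}}([\![\mathcal{A} \cup \alpha]\!](u_1), \ldots, [\![\mathcal{A} \cup \alpha]\!](u_n))\,]_{\cong}$
$= [\,f^{\mathcal{A} \cup \alpha}([\![\mathcal{A} \cup \alpha]\!](u_1), \ldots, [\![\mathcal{A} \cup \alpha]\!](u_n))\,]_{\cong}$
$= [\,[\![\mathcal{A} \cup \alpha]\!](f(u_1,\ldots,u_n))\,]_{\cong}$.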

Showing the second item is easy: by definition $[a]_{\cong} \; R/{\cong} \; [b]_{\cong}$ iff $a \;({\cong};R;{\cong})\; b$.
By the assumption ${\cong} \subseteq R^*$, this implies $a \;(R^*;R;R^*)\; b$, from which $a \; R \; b$ follows
by the assumption (trans) $\in L$. □

Example 12. Consider the relational model $(T(\Sigma,X), \leftrightarrow^*_{Mul})$ of $\mathcal{S}_{Mul}$ of
Example 7. Taking for $\cong$ the convertibility relation $\leftrightarrow^*_{Mul}$, we see that it satisfies
the first condition of Lemma 2, hence that the convertibility relation itself can
be quotiented out. As one easily checks, this yields a relational model having
the classes of convertible terms as elements, and having the identity relation id
as relation. Note that the first component of the resulting model is a model in
the sense of the previous section. That is, we have constructed a model from a
relational model.

The construction in the example can be generalised in the sense that if the
relation of a relational model contains a non-trivial congruence it can be quotiented
out. In fact, we take this as the defining property of a model.

Definition 10. A model of $\mathcal{S}$ is a congruence-free relational model. Here, a
relational model $(\mathcal{A}, R)$ of a specification $\mathcal{S}$ is congruence-free if the
reflexive-transitive closure $R^*$ of $R$ contains no congruence relations other than the
identity relation id. We say $(s, t)$ is valid, written $\models s\,S\,t$, in case $s \; R \; t$ in all models
$(\mathcal{A},R)$ of $\mathcal{S}$.

Hence, validity is obtained from relational validity by restricting the relational 
models to models. 

Proposition 1. For any relational model $\mathcal{M} := (\mathcal{A},R)$, $\mathcal{M}/{\cong}$ is a model where
$\cong$ is a maximal congruence relation such that ${\cong} \subseteq R^*$.

Proof. That a maximal congruence relation exists follows from Kuratowski’s
Lemma, since the union of the congruence relations in a chain is easily seen to
be a congruence relation again. That the quotient $\mathcal{M}/{\cong}$ is a relational model
follows from Lemma 2, and that it is congruence-free holds, since otherwise the
‘offending’ congruence $\cong'$ could have been composed with $\cong$ right away. More
precisely, in such a case, defining $a$ to be related to $b$ iff $[a]_{\cong} \cong' [b]_{\cong}$ would
have given a congruence relation on $A$ still contained in $R^*$, but larger than $\cong$,
contradicting the latter’s maximality. □

Let R be the relation of a relational model for S. 

Example 13. As seen above, R itself is the maximal congruence in the case of 
an equational specification, and models, i.e. congruence-free relational models, 
are in 1-1 correspondence with the models of the introduction. That is, for an 
equational specification £, the standard and sub-equational notions of validity 
coincide. 

Generalising the example, one notes that if $R$ is both transitive and
operation-preserved, such as is the case for (positive) ordering logic, then the reflexive
closure of $R \cap R^{-1}$ is the largest congruence relation contained in $R^*$. This
maximal congruence just identifies all objects in strongly connected components
of $R$. (Note that if $R$ is terminating, then quotienting does nothing.) Hence
models of ordering and positive ordering specifications have partial orders
(reflexive, transitive and anti-symmetric relations) and positive orders (transitive
and anti-symmetric relations) respectively, as relations.
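The identification of strongly connected components can be illustrated on a finite carrier without operations, where a congruence is just an equivalence. This is a minimal sketch under that simplifying assumption; the carrier, the relation `r` and all function names are hypothetical.

```ocaml
(* A finite carrier and a transitive relation on it, given as pairs.
   Hypothetical example: 0 and 1 lie on a cycle, 2 is above both. *)
let carrier = [0; 1; 2]
let r = [ (0, 1); (1, 0); (0, 0); (1, 1); (0, 2); (1, 2) ]

let rel a b = List.mem (a, b) r

(* Reflexive closure of the intersection of R with its inverse. *)
let equiv a b = a = b || (rel a b && rel b a)

(* Equivalence class of a, as a list of carrier elements. *)
let cls a = List.filter (equiv a) carrier

(* Quotient relation: two classes are related iff some representatives are. *)
let quot_rel ca cb =
  List.exists (fun a' -> List.exists (fun b' -> rel a' b') cb) ca

(* The distinct classes, i.e. the carrier of the quotient. *)
let classes = List.sort_uniq compare (List.map cls carrier)
```

Here the cycle between 0 and 1 collapses into one class, and the quotient relation is anti-symmetric between the two remaining classes.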

Example 14. The models of ordering specifications are better known as quasi-
models [1, Definition 6.5.30].

By the quotient construction, checking validity on relational models can be
restricted to checking validity on models in the case of transitive specifications, that
is, specifications which have (trans) as a mode of inference.

Lemma 3. $s\,S\,t$ is relationally valid iff $\models s\,S\,t$, for transitive specifications $\mathcal{S}$.

Proof. The only-if-direction holds since models are a special case of relational
models. The if-direction follows, since by Proposition 1, any relational model of
$\mathcal{S}$ gives rise to a model, in which $s$ and $t$ are related by the assumption $\models s\,S\,t$,
but then $s$ and $t$ were related in the relational model as well, by the second item
of the Quotient Lemma 2 using the assumption that $\mathcal{S}$ is transitive. □



Theorem 1 (Birkhoff). $\vdash s\,S\,t$ iff $\models s\,S\,t$, for transitive $\mathcal{S}$.

Proof. By Lemmas 3 and 1. □ 



Example 15. For equational specifications this is just Birkhoff’s theorem [3]. 

For ordering specifications, the theorem states the correspondence between valid- 
ity w.r.t. Zantema’s quasi-models and derivability in Meseguer’s rewriting logic 
(using their own terminology), a result originally due to [6]. 

4 Logicality 

We present a uniform method to define convertibility relations for sub-equational 
logics (Subsection 4.1) and show their logicality [4] (Subsection 4.2), i.e. show that 
convertibility coincides with derivability for sub-equational specifications. 



4.1 Convertibility 

Definition 11. Let $\mathcal{S}$ be a sub-equational specification with modes of inference
$L$. Its sub-convertibility relation $S(\to)$ is obtained by starting with the empty
relation and closing under the inference rule $\ell$ of sub-equational logic if $\ell \in L$,
in the order: (emb), (comp), (ref), (sym), (trans).

Let $\mathcal{S}$ be a sub-equational specification. Of course, in the case of a rewriting
specification, having {(emb), (comp)} as modes of inference, $S(\to)$ is just the rewrite
(step) relation generated by the rules. Other examples are:

Example 16. 1. For an equational specification $S(\to)$ is convertibility $\leftrightarrow^*_S$.

2. For an ordering specification $S(\to)$ is rewritability/reachability $\to^*_S$.

3. For a positive ordering specification $S(\to)$ is positive reachability $\to^+_S$.

Further examples one could think of are head steps ({(emb)}), for modelling
process calculi; identity ({(ref)}), for which $S(\to)$ is just syntactic identity; or empty
($\emptyset$), for which $S(\to)$ is the empty relation.

4.2 Closure

We prove that derivability coincides with convertibility for a given sub-equational 
specification. As convertibility is defined as a special case of derivability, i.e. by 
applying the inference rules in the order as given in Definition 11, it is clearly 
contained in it. To show the other inclusion it suffices to prove that closing under 
an inference mode preserves closure under inference modes earlier in the order, 
since then the generated relation must coincide with derivability as the latter is 
the least relation closed under each inference mode. We illustrate this by means 
of an example. 

Example 17. Suppose the relation $R$ is compatible and we take its symmetric
closure, yielding $R \cup R^{-1}$. We must show that compatibility is preserved. That is,
we must prove that $f(s_1, \ldots, s_n) \; (R \cup R^{-1}) \; f(t_1, \ldots, t_n)$ holds, under the
assumption $s_1, \ldots, s_n =[R \cup R^{-1}] \; t_1, \ldots, t_n$. We distinguish cases according to whether
the assumption is due to $R$ or $R^{-1}$ holding between two premisses.

- If the assumption is due to $R$, then the result follows by (comp) for $R$.

- If the assumption is due to $R^{-1}$, then $t_1, \ldots, t_n =[R] \; s_1, \ldots, s_n$, hence by
(comp) $f(t_1,\ldots,t_n) \; R \; f(s_1,\ldots,s_n)$, hence by (sym)
$f(s_1,\ldots,s_n) \; (R \cup R^{-1}) \; f(t_1,\ldots,t_n)$.

Checking preservation for all other combinations is as easy.

Proposition 2. Closing relations in the order of Definition 11 preserves the 
properties/inference rules earlier in the order. 

Proof. First, note that all operations are monotonic in the sense that they
may generate new conclusions, but preserve all existing ones. As the (emb) and
(ref) inference rules have empty premisses, monotonicity explains the
corresponding rows in the following table, which displays vertically the property which is to
be preserved under closing with respect to the horizontally indicated inference
mode.





          (emb)   (comp)   (ref)   (sym)   (trans)
(emb)       X      mon      mon     mon     mon
(comp)      X       X      (ref)   (sym)   (trans)
(ref)       X       X        X      mon     mon
(sym)       X       X        X       X     (trans)
(trans)     X       X        X       X       X



No closures are taken after (trans), which explains the last row. Preservation in
the (comp)- and (sym)-rows follows by easy structural manipulations, using the
inference rule given in the table as the final step. For instance, the proof that (comp)
is preserved under (sym) employs (sym) as final rule, as shown in Example 17.
The other entries are dealt with in an analogous way. □
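That the order of closure matters can be seen on finite relations. The sketch below is an illustration only and covers just (sym) and (trans) on a hypothetical three-element relation `r0`: closing under (trans) after (sym) keeps symmetry, whereas the reverse order loses transitivity. All names are assumptions.

```ocaml
(* Finite relations as sorted, duplicate-free lists of pairs. *)
let norm r = List.sort_uniq compare r

(* Symmetric closure: R together with its inverse. *)
let sym r = norm (r @ List.map (fun (a, b) -> (b, a)) r)

(* One composition step: add (a, d) whenever (a, b) and (b, d) are present. *)
let step r =
  let r = norm r in
  let compose (a, b) =
    List.filter_map (fun (c, d) -> if b = c then Some (a, d) else None) r
  in
  norm (r @ List.concat (List.map compose r))

(* Transitive closure: iterate step until nothing new is added. *)
let rec trans r =
  let r = norm r in
  let r' = step r in
  if r' = r then r else trans r'

let is_sym r = List.for_all (fun (a, b) -> List.mem (b, a) r) r
let is_trans r = step r = norm r

let r0 = [ (1, 2); (1, 3) ]
let in_order = trans (sym r0)      (* (sym) first, then (trans) *)
let out_of_order = sym (trans r0)  (* (trans) first, then (sym) *)
```

In `out_of_order` the pairs (2,1) and (1,3) are present but (2,3) is not, so a further (trans) closure would be needed.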

Remark 2. Alternatively, one could permute any two consecutive inference rules
in a derivation which are in the ‘wrong’ order. One easily shows that permutation
is always possible, that the process terminates (use e.g. recursive path orders),
and that the resulting derivation (the normal form) is a conversion.

Theorem 2 (Logicality). $\vdash s\,S\,t$ iff $s \; S(\to) \; t$, for specifications $\mathcal{S}$.

Proof. ($\Rightarrow$) It suffices to verify that the term algebra $T(\Sigma,X)$ with relation
$S(\to)$ constitutes a relational model. This follows directly from Proposition 2.
($\Leftarrow$) Trivial, since $S(\to)$ is constructed by successively closing under the
inference rules which are also part of the sub-equational specification $\mathcal{S}$. □

As a final application combining the Birkhoff and Logicality theorems con- 
sider the following result due to Zantema [1, Theorem 6.2.2]: 

Theorem 3. A TRS is terminating if and only if it admits a compatible well- 
founded monotone algebra. 

Proof. View the TRS as a positive ordering specification. From the above we
then have that $\to^+$ is sound and complete w.r.t. positively ordered models. If the
order is required to be well-founded, such models coincide with compatible well-
founded monotone algebras. Hence the if-direction follows from the existence
of such a model by soundness. The only-if direction follows by the relational
term model construction, and the observation made above that quotienting a
terminating relation does nothing.

Note that for this example to work it was necessary to drop (ref), i.e. one could 
work with neither equational nor rewriting logic. Also, building-in transitivity 
in the order of a monotone algebra would not have been necessary; working 
with a terminating relation instead would be fine as well. More generally, often 
a big step semantics can easily be replaced by a small step semantics without 
problems. 

5 Conclusion 

We have given a uniform presentation of Birkhoff-style sound- and completeness 
results for various sub-equational logics. Moreover, we have given a uniform 
proof of logicality of rewriting for each of them. Although the results are not 
very surprising, we have not seen such a uniform presentation before. Moreover 
we do think the resulting presentation is elegant and the analysis required and 
performed is useful.

Abstract. Restricting polymorphism to values is now the standard way to obtain
soundness in ML-like programming languages with imperative features. While 
this solution has undeniable advantages over previous approaches, it forbids poly- 
morphism in many cases where it would be sound. We use a subtyping based 
approach to recover part of this lost polymorphism, without changing the type 
algebra itself, and this has significant applications. 



1 Introduction 

Restricting polymorphism to values, as Wright suggested [1], is now the standard way 
to obtain soundness in ML-like programming languages with imperative features. The 
long version of this paper explains in detail how this conclusion was reached [2]. 
This solution’s main advantages are its utter simplicity (only the generalization rule 
is changed from the original Hindley-Milner type system), and the fact it avoids distin- 
guishing between applicative and imperative type variables, giving identical signatures 
to pure and imperative functions. This property is sometimes described as implementa- 
tion abstraction. 

Of course, this solution is sometimes more restrictive than previous ones. In partic- 
ular, all previous solutions, both those based on weak type variables [3,4,5, 6] and those 
based on more refined approaches using effects [7] or closure typing [8,9], were con- 
servative: typing was only restricted for functions actually constructing imperative data. 
The value restriction is not conservative: by assuming that all functions may be imper- 
ative, lots of polymorphism is lost. However, this extra polymorphism appeared to be 
of limited practical use, and experiments have shown that the changes needed to adapt 
ML programs typechecked using stronger type systems to the value only polymorphism 
type system were negligible. 

Almost ten years after the fact, it might be useful to check whether this is still true.
Programs written ten years ago were not handicapped by the value restriction, but what 
about programs we write now, or programs we will write in the future? 

In his paper, Wright considers 3 cases of let-bindings where the value restriction
causes a loss of polymorphism.

1. Expressions that never return. They do not appear to be really a problem, but he
remarks that in the specific case of ∀α.α, it would be sound to keep the stronger
type.

2. Expressions that compute polymorphic procedures. This amounts to a partial
application. Analysis of existing code showed that their
evaluation was almost always purely applicative, and as a result one could recover
the polymorphism through eta-expansion of the whole expression, except when the
returned procedure is itself embedded in a data structure.

3. Expressions that return polymorphic data structures. A typical example is an
expression that always returns the empty list. It should be given the polymorphic type
α list, but this is not possible under the value restriction if the expression has to
be evaluated.

Of these 3 cases, the last one, together with the data-structure case of the second 
one, are most problematic: there is no workaround to recover the lost polymorphism, 
short of recomputing the data structure at each use. This seemed to be a minor problem, 
because existing code made little use of this kind of polymorphism inside a data struc- 
ture. However we can think of a number of cases where this polymorphism is expected, 
sometimes as a consequence of extensions to the type system. 

1. Constructor and accessor functions. While algebraic datatype constructors and
pattern matching are handled specially by the type system, and can be given a
polymorphic type, as soon as we define functions for construction or access, the
polymorphism is lost. The consequence is particularly bad for abstract datatypes and
objects [10], as one can only construct them through functions, meaning that they
can never hold polymorphic values.

2. Polymorphic variants [11]. By nature, a polymorphic variant is a polymorphic data
structure, which can be seen as a member of many different variant types. If it is
returned by a function, or contains a computation in its argument, it loses this
polymorphism.

3. Semi-explicit polymorphism [12]. This mechanism allows one to keep principality of
type-checking in the presence of first-class polymorphism. This is done through
adding type variable markers to first-class polymorphic types, and checking their
polymorphism. Unfortunately, the value restriction loses this polymorphism. A
workaround did exist, but the resulting type system was only “weakly” principal.

We will review these cases, and show how the value restriction can be relaxed a 
little, just enough for many of these problems to be leveled. As a result, we propose a 
new type system for ML, with relaxed value restriction, that is strictly more expressive 
(it types more programs) than ML with the usual value restriction. 

The starting point is very similar to the original observation about ∀α.α: in some
cases, polymorphic types are too generic to contain any value. As such they can only
describe empty collections, and it is sound to allow their generalization.

Our basic idea is to use the structural rules of subtyping to recover this
polymorphism: by subsumption, if a type appears at a covariant position inside the type of a
value, it shall be safe to replace it with any of its supertypes. From a set-theoretic point
of view, if this type is not inhabited, then it is a subtype of all other types (they all
contain the empty set). If it can be replaced by any type, then we can make it a polymorphic
variable. This basically means that, even for side-effecting expressions, it is sound to
generalize non-contravariant type variables.

Unfortunately, this model-based reasoning cannot be translated into a direct proof:
we are aware of no set-theoretic model of ML extended with references. Neither can we
use a direct syntactic proof, as our system, while being sound, does not enjoy subject
reduction. Nonetheless this intuition will lead us to an indirect proof, by translation into
a stronger type system with subtyping.

This paper is organized as follows. We first describe our approach in more detail, 
and apply it to simple cases. Then we show how it helps solving the problems described 
above. In section 4 we answer some concerns. In section 5 we formalize our language 
and type system, and prove its soundness using semantic types in section 6, before 
concluding. More details and extra proofs can be found in the accompanying technical 
report [2]. 



2 Polymorphism from Subtyping 

Before entering into the details of our method, let us define our intent. 

We follow the value restriction, and keep its principles: simplicity and abstraction. 
That is, we do not distinguish at the syntactic level between applicative and imperative 
type variables; neither do we introduce different points of quantification, as in rank-1 
polymorphism [13]. All type variables in any function type are to be seen as imperative:
by default, they become non-generalizable in the let-binding of a non-value (i.e. a term
containing a function application), on a purely syntactical criterion.

However we can analyze the semantic properties of types, independently of the im- 
plementation. By distinguishing between covariant and contravariant variables in types 
we are able to partially lift this restriction when generalizing: as before, variables with 
contravariant occurrences in the type of an expansive expression cannot be generalized, 
but variables with only covariant occurrences can be generalized. 

The argument goes as follows. We introduce a new type constructor, zero, which
is kept empty. We choose to instantiate all non-contravariant variables in let-bound
expressions by zero. In a next step we coerce the type of the let-bound variable to a type
where all zero’s are replaced by (independent) fresh type variables. Since the
coercion of a variable is a value, in this step we are no longer limited by the value restriction,
and these type variables can be generalized.

To make explanations clear, we will present our first two examples following the
same pattern: first give the non-generalizable type scheme as by the value restriction
(typed by Objective Caml 3.06 [14]), then obtain a generalized version by explicit
subtyping. However, as explained in the introduction, our real intent is to provide a
replacement for the usual value restriction, so we will only give the generalized version (as
Objective Caml 3.07 does) in subsequent examples. Here is our first example.

let l =
  let r = ref [] in !r
val l : '_a list = []

The type variable '_a is not generalized: it will be instantiated when used, and fixed
afterwards. This basically means that l is now of a fixed type, and cannot be used in
polymorphic contexts anymore.

$V^-(\tau\ \mathrm{ref}) = FTV(\tau)$        $V^-(\tau_1 \to \tau_2) = FTV(\tau_1) \cup V^-(\tau_2)$
$V^-(\tau\ \mathrm{list}) = V^-(\tau)$       $V^-(\tau_1 \times \tau_2) = V^-(\tau_1) \cup V^-(\tau_2)$

Fig. 1. Dangerous variables



Our idea is to recover polymorphism through subtyping.

let l = (l : zero list :> 'a list)
val l : 'a list = []

A coercion (e : τ1 :> τ2) makes sure that e has type τ1, and that τ1 is a subtype of τ2.
Then, it can safely be seen as having type τ2. Since l is a value, and the coercion of a
value is also a value, this is a value binding, and the new 'a in the type of the coerced
term can be generalized.

Why is it sound? Since we assigned an empty list to r, and returned its contents
without modification, l can only be the empty list; as such it can safely be assigned a
polymorphic type.
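The effect can be observed without any explicit coercion. This is a minimal sketch assuming a compiler that implements the relaxed value restriction (the text names Objective Caml 3.07; current OCaml releases are assumed to behave alike). The names `ints` and `strings` are illustrative only.

```ocaml
(* Not a syntactic value: evaluating it allocates a reference cell. *)
let l =
  let r = ref [] in
  !r

(* Both uses type-check only if l was given the polymorphic type 'a list. *)
let ints = 1 :: l
let strings = "a" :: l
```

Under the strict value restriction the second use would be rejected, because the first use would have fixed the element type to int.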

Comparing with conservative type systems, Leroy’s closure-based typing [8] would
indeed infer the same polymorphic typing, but Tofte’s imperative type variables [3]
would not: since the result is not a closure, with Leroy’s approach the fact that [] went
through a reference cell doesn’t matter; however, Tofte’s type system would force its
type to be imperative, precluding any further generalization when used inside a non-
value binding.

The power of this approach is even more apparent with function types,

let f =
  let r = ref [] in fun () -> !r
val f : unit -> '_a list

which we can coerce again

let f = (f : unit -> zero list :> unit -> 'a list)
val f : unit -> 'a list

This result may look more surprising, as actually r is kept in the closure of f. But since
there is no way to modify its contents, f can only return the empty list. This time, even
Leroy’s closure typing and Talpin & Jouvelot’s effect typing [7] cannot meet the mark.
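The function case can be tried in the same way. Again a minimal sketch, assuming a compiler with the relaxed value restriction; `ints` and `bools` are hypothetical names used only to instantiate the result type twice.

```ocaml
(* r stays in the closure of f, but nothing can ever write to it. *)
let f =
  let r = ref [] in
  fun () -> !r

(* Two instantiations of the result type; both require f : unit -> 'a list. *)
let ints : int list = f ()
let bools : bool list = f ()
```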

This reasoning holds as long as a variable does not appear in a contravariant
position. Yet, for type inference reasons we explain in section 5, we define a set of dangerous
variables (figure 1) including all variables appearing on the left of an arrow, which is
more restrictive than simple covariance. In a non-value binding, we will generalize all
local variables except those in $V^-(\tau)$, assuming the type before generalization is τ.
This definition is less general than subtyping, as a covariant type variable with multiple
occurrences will be kept shared. For instance, subtyping would allow ('_a * '_a)
list to be coerced to ('a * 'b) list, but type inference will only give the less
general ('a * 'a) list.

Of course, our approach cannot recover all the polymorphism lost by the value
restriction. Consider for instance the partial application of map to the identity function.

let map_id = List.map (fun x -> x)
val map_id : '_a list -> '_a list

Since '_a also appears in a contravariant position, there is no way this partial
application can be made polymorphic. As with the strict value restriction, we would have to
eta-expand to obtain a polymorphic type.

However, the relaxed value restriction becomes useful if we fully apply map, a case
where eta-expansion cannot be used.

let l = List.map (fun id -> id) []
val l : 'a list
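The two situations can be contrasted in a few lines. This is a minimal sketch assuming a compiler with the relaxed value restriction; the eta-expanded `map_id` and the bindings `a` to `d` are illustrative names.

```ocaml
(* Eta-expanded: a syntactic function, hence a value, so fully polymorphic. *)
let map_id xs = List.map (fun x -> x) xs

(* Full application: not a value, but the type variable occurs only
   covariantly in the result type, so it is still generalized. *)
let l = List.map (fun id -> id) []

let a = map_id [1; 2]
let b = map_id ["x"]
let c = 1 :: l
let d = "y" :: l
```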

Note that all the examples presented in this section cannot be handled by rank-1 
polymorphism. This is not necessarily the case for examples in the next section, but this 
suggests that improvements by both methods are largely orthogonal. 

While our improvements are always conceptually related to the notion of empty 
container, we will see in the following examples that it can show up in many flavors, 
and that in some cases we are talking about concrete values, rather than empty ones. 

3 Application Examples 

In this section, we give examples of the different problems described in the introduction, 
and show how we improve their typings. 

3.1 Constructor and Accessor Functions 

In ML, we can construct values with data constructors and extract them with pattern 
matching. 

let empty2 = ([], [])
val empty2 : 'a list * 'b list = ([], [])

let (_, l2) = empty2
val l2 : 'a list = []

As you can see here, since neither operation uses functions, the value restriction does
not get in the way, and we obtain a polymorphic result. However, if we use a function
as accessor, we lose this polymorphism.

let l2 = snd empty2
val l2 : '_a list = []

Moreover, if we define custom constructors, then polymorphism is lost in the original
data itself. Here pair assists in building a Lisp-like representation of tuples.

let pair x y = (x, (y, ()))
val pair : 'a -> 'b -> 'a * ('b * unit)

let empty2' = pair [] []
val empty2' : '_a list * ('_b list * unit) = (..)

The classical workaround to obtain a polymorphic type involves eta-expansion, which
means code changes, extra computation, and is incompatible with side-effects, for
instance if we were to count the number of cons-cells created.

If the parameters to the constructor have covariant types, then the relaxed value
restriction solves all these problems.

let l2 = snd empty2
val l2 : 'a list = []

let empty2' = pair [] []
val empty2' : 'a list * ('b list * unit) = (..)

This extra polymorphism allows one to share more values throughout a program. 
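The sharing can be made concrete as follows. A minimal sketch, assuming a compiler with the relaxed value restriction; `ints`, `strs`, `i` and `s` are hypothetical names that merely use the shared values at different types.

```ocaml
let empty2 = ([], [])
let l2 = snd empty2            (* accessor function: not a syntactic value *)

let pair x y = (x, (y, ()))
let empty2' = pair [] []       (* custom constructor: not a syntactic value *)

(* The same l2 and empty2' are reused at two different element types. *)
let ints = 1 :: l2
let strs = "a" :: l2
let i = 1 :: fst empty2'
let s = "a" :: fst empty2'
```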

3.2 Abstract Datatypes 

This problem is made more acute by abstraction. Suppose we want to define an abstract 
datatype for bounded length lists. This can be done with the following signature: 

module type BLIST = sig
  type +'a t
  val empty : int -> 'a t
  val cons : 'a -> 'a t -> 'a t
  val list : 'a t -> 'a list
end

module Blist : BLIST = struct
  type 'a t = int * 'a list
  let empty n = (n, [])
  let cons a (n, l) =
    if n > 0 then (n-1, a::l) else raise (Failure "Blist.cons")
  let list (n, l) = l
end

The + in type +'a t is a variance annotation, and is available in Objective Caml
since version 3.01. It means that 'a appears only in covariant positions in the definition
of t. This additional information was already used for explicit subtyping coercions 
(between types including objects or variants), but with our approach we can also use it 
to automatically extract more polymorphism. 

The interesting question is what happens when we use empty. Using the value
restriction, one would obtain:

let empty5 = Blist.empty 5
val empty5 : '_a Blist.t = <abstract>

Since the type variable is monomorphic, we cannot reuse this empty5 as the empty
5-bounded list; we have to create a new empty list for each different element type. And
this time, we cannot get the polymorphism by building the value directly from data
constructors, as abstraction has hidden the type’s structure.

Just as for the previous example, the relaxed value restriction solves the problem: since
'_a is not dangerous in '_a Blist.t, we shall be able to generalize it.

val empty5 : 'a Blist.t = <abstract>
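Put together as one compilable unit, the example reads as follows. This is a minimal sketch assuming a compiler with the relaxed value restriction; the bindings `ints` and `strs` are illustrative additions that reuse the single `empty5` at two element types.

```ocaml
module type BLIST = sig
  type +'a t                         (* covariance is declared here *)
  val empty : int -> 'a t
  val cons : 'a -> 'a t -> 'a t
  val list : 'a t -> 'a list
end

module Blist : BLIST = struct
  type 'a t = int * 'a list
  let empty n = (n, [])
  let cons a (n, l) =
    if n > 0 then (n - 1, a :: l) else raise (Failure "Blist.cons")
  let list (_, l) = l
end

(* A function application, yet generalized because 'a is covariant in t. *)
let empty5 = Blist.empty 5

let ints = Blist.list (Blist.cons 1 empty5)
let strs = Blist.list (Blist.cons "a" empty5)
```

Removing the `+` from the signature makes the type invariant from the outside, and the second use of `empty5` is then expected to be rejected.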

With the relaxed value restriction, abstract constructors can be polymorphic as long
as their type variables are covariant inside the abstract type.

3.3 Object Constructors

As one would expect from its name, Objective Caml sports object-oriented features.
Programmers are often tempted to use classes in place of algebraic datatypes. A
classical example is the definition of lists.

class type ['a] list = object
  method empty : bool
  method hd : 'a
  method tl : 'a list
end

class ['a] nil : ['a] list = ...

class ['a] cons a b : ['a] list = ...

This all looks fine, until you realize that you cannot create a polymorphic empty
list: an object constructor is seen by the type system as a function. Again, as 'a is
covariant in 'a list, it is generalizable, and the relaxed value restriction allows a
polymorphic type.

let nil : 'a list = new nil 
val nil : 'a list = <obj> 
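
The class-based list can be sketched in full as follows. This is only an illustration: the class type is renamed olist (an assumption, to avoid shadowing the built-in list type), and the bodies of nil and cons, which the text elides, are hypothetical.

```ocaml
class type ['a] olist = object
  method empty : bool
  method hd : 'a
  method tl : 'a olist
end

(* Hypothetical body: accessing hd or tl of the empty list fails. *)
class ['a] nil : ['a] olist = object
  method empty = true
  method hd : 'a = failwith "hd"
  method tl : 'a olist = failwith "tl"
end

(* Hypothetical body: a cell holding a head and a tail. *)
class ['a] cons (a : 'a) (b : 'a olist) : ['a] olist = object
  method empty = false
  method hd = a
  method tl = b
end

(* new nil is not a value, but 'a is covariant in 'a olist,
   so nil receives a polymorphic type. *)
let nil : 'a olist = new nil

(* The same nil is used as the tail of an int list and a string list. *)
let i = (new cons 1 nil)#hd
let s = (new cons "x" nil)#hd
```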

3.4 Polymorphic Variants 

Polymorphic variants [11,15] are another specific feature of Objective Caml. Their de- 
sign itself contradicts the assumption that polymorphic data structures are rare in ML 
programs: by definition a polymorphic variant can belong to any type that includes its 
tag. 

let one = `Int 1
val one : [> `Int of int] = `Int 1
let two = `Int (1+1)
val two : _[> `Int of int] = `Int 2

Again the value restriction gets in our way: it's enough that the argument is not a value
to make the variant constructor monomorphic (as shown by the _ in front of the type).
And of course, any variant returned by a function will be given a monomorphic type.
This means that in all previous examples, you can replace the empty list by any
polymorphic variant, and the same problem will appear.

Again, we can use our coercion principle^:

let two = (two : [`Int of int] :> [> `Int of int])
val two : [> `Int of int] = `Int 2

This makes using variants in multiple contexts much easier. Polymorphic variants 
profit considerably from this improvement. One would like to see them simply as the 
dual of polymorphic records (or objects), but the value restriction has broken the duality. 

^ zero amounts here to an empty variant type, and if we show the internal row extension
variables the coercion would be (two : [`Int of int | zero] :> [`Int of int | 'a]),
meaning that in one case we allow no other constructor, and in the other case we allow
any other constructor.
For polymorphic records, it is usually enough to have polymorphism of functions that
accept a record, but for polymorphic variants the dual would be polymorphism of
variants themselves, including results of computations, which the value restriction did not
allow. While Objective Caml allowed polymorphism of functions that accept a variant,
there were still many cases where one had to use explicit subtyping, as the same value
could not be used in different contexts by polymorphism alone. For instance consider
the following program:

val all_results :
  [ `Bool of bool | `Float of float | `Int of int] list ref
val num_results : [ `Float of float | `Int of int] list ref
let div x y =
  if x mod y = 0 then `Int(x/y) else `Float(float x /. float y)
val div : int -> int -> [> `Float of float | `Int of int]
let comp x y =
  let z = div x y in
  all_results := z :: !all_results;
  num_results := z :: !num_results
val comp : int -> int -> unit

Since all_results and num_results are toplevel references, their types must be 
ground. With the strict value restriction, z would be given a monomorphic type, which 
would have to be equal to the types of both references. Since the references have differ- 
ent types, this is impossible. With the relaxed value restriction, z is given a polymorphic 
type, and distinct instances can be equal to the two reference types. 
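
The program can be made self-contained with a small sketch. The text gives only the signatures of the two references, so initializing them to empty lists is an assumption.

```ocaml
(* Assumed initial values: the paper only states the types. *)
let all_results :
  [ `Bool of bool | `Float of float | `Int of int ] list ref = ref []
let num_results : [ `Float of float | `Int of int ] list ref = ref []

let div x y =
  if x mod y = 0 then `Int (x / y) else `Float (float x /. float y)

let comp x y =
  (* div x y is not a value, but the row variable of its result type
     occurs only covariantly, so z is polymorphic. *)
  let z = div x y in
  all_results := z :: !all_results;
  num_results := z :: !num_results
```

Each use of z takes a distinct instance of its type, one for each reference.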

3.5 Semi-explicit Polymorphism 

Since version 3.05, Objective Caml also includes an implementation of semi-explicit 
polymorphism [12], which allows the definition of polymorphic methods in objects. 

The basic idea of semi-explicit polymorphism is to allow universal quantification
anywhere in types (not only in the prefix), but to restrict instantiation of these variables
to cases where the first-class polymorphism is known at the instantiation point. To
obtain a principal notion of knowledge, types containing quantifiers are marked by type
variables (which are only used as markers), and a quantified type can only be
instantiated when its marker variable is generalizable. Explicit type annotations can be used to
force markers to be polymorphic.

We will not explain here in detail how this system works, but the bottom line is that
inferred polymorphism can be used to enforce principality. While this idea works very
well with the original Hindley-Milner type system, problems appear with the value
restriction.

We demonstrate here Objective Caml's behavior. The marker variable $\varepsilon$ on the type
$\mathtt{poly}^{\varepsilon}$ is hidden in the surface language.

class poly : object method id : 'a. 'a -> 'a end

let f (x : poly) = (x#id 1, x#id true)
val f : poly -> int * bool = <fun>

let h () = let x = new poly in (x#id 1, x#id true)
val h : unit -> int * bool = <fun>
f is a valid use of polymorphism: the annotation is on the binding of x and can be
propagated to all its uses, i.e. the type of x is $\forall\varepsilon.\,\mathtt{poly}^{\varepsilon}$. But h would not be accepted
under the strict value restriction, because new poly is not a value, so that the type
$\mathtt{poly}^{\varepsilon}$ of x is not generalizable. Since refusing cases like h would greatly reduce the
interest of type inference, it was actually accepted, on the grounds that markers have no impact
on soundness. A system allowing this is formalized in [12], yet it replaces full-blown
principality by a notion of principality among maximal derivations, which is a weaker
property.

By using our scheme of generalizing type variables that do not appear in dangerous 
positions, we can recover full principality, with all its theoretical advantages, and accept 
h “officially”. 
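
A runnable version of the two functions, as a sketch: the text shows only the class signature, so the identity implementation of the method id is an assumption.

```ocaml
(* Assumed implementation of the polymorphic method. *)
class poly = object
  method id : 'a. 'a -> 'a = fun x -> x
end

(* The annotation on x makes the polymorphism of id known. *)
let f (x : poly) = (x#id 1, x#id true)

(* Here x is bound to a non-value, new poly, and is still usable
   at two different instances of id. *)
let h () = let x = new poly in (x#id 1, x#id true)
```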

Note also that since these markers may appear in types that otherwise have no 
free type variables, this boosts the number of data structures containing polymorphic 
(marker) variables. That is, semi-explicit polymorphism completely invalidates the as- 
sumption that polymorphic values that are not functions are rare and not essential to 
ML programming. 



4 Concerns 



This section addresses some natural concerns about the relaxed value restriction. 



4.1 Typing Power and Usefulness 

A first question is how powerful the relaxed value restriction is, compared to the value
restriction and other known systems, and whether its improvements are genuinely
useful or not. If we considered only benchmarks proposed in the literature [7,8], we would
come to the conclusion that the relaxed value restriction adds no power: its results
exactly match those of the strict value restriction. This is because all examples in the
literature are only concerned with polymorphic procedures, not polymorphic data.

In the previous section we have given a fair number of examples handling
polymorphic data. They demonstrate the additional power of our system. Compared with systems
predating the value restriction, we are in general less powerful, with some exceptions as
shown in section 2. However, in practice implementation abstraction matters more than
pure typing power, and on this side we keep the good properties of the value restriction.

Our examples with constructor functions and abstract datatypes were expressible in 
systems predating the value restriction, and are refused by the strict value restriction. 
This makes one wonder why this didn’t cause more problems during the transition. 
These idioms were apparently rarely used then. However, the author believes he is not 
alone in having experienced exactly those problems on newly written code. And there 
have been explicit reports of polymorphism problems with objects and variants, justi- 
fying the need for such an improvement. 

4.2 Abstraction

While we claim that our scheme is not breaking implementation abstraction, one may 
remark that we require variance annotations for abstract datatype definitions. Aren’t 
these annotations breaking abstraction? 

Clearly, specifying a variance reduces the generality of an interface, and as such it 
is reducing its abstraction degree. However we claim that this does not mean that we 
are breaking implementation abstraction. We give here a concrete example, defining 
covariant vectors on top of nonvariant mutable arrays. 

type +'a vector = {get: int -> 'a; length: int}
let make len f =
  let arr = if len = 0 then [||] else Array.create len (f 0) in
  for i = 1 to len-1 do arr.(i) <- f i done;
  {get=Array.get arr; length=len}
val make : int -> (int -> 'a) -> 'a vector

let map f vect = make vect.length (fun i -> f (vect.get i))
val map : ('a -> 'b) -> 'a vector -> 'b vector

This example demonstrates that variance is not limited by the implementation.
By changing the superficial definition, while keeping the same internal implementa- 
tion, we may improve the variance of a datatype. This situation is to be compared with 
imperative type variables, or equality type variables, whose specificity must be propa- 
gated through any definition they are used in, making it impossible to abstract from the 
implementation. 
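
As a sketch of how this plays out behind an abstract interface, the vector type can be exported as +'a t while still being backed by a non-variant array. The module name Vector, the accessors get and length, and the values empty and squares are illustrative assumptions; Array.make is used in place of the older Array.create.

```ocaml
module Vector : sig
  type +'a t
  val make : int -> (int -> 'a) -> 'a t
  val get : 'a t -> int -> 'a
  val length : 'a t -> int
  val map : ('a -> 'b) -> 'a t -> 'b t
end = struct
  (* The array is hidden inside the closure stored in the get field,
     so 'a only occurs covariantly in the record type. *)
  type 'a t = { get : int -> 'a; length : int }
  let make len f =
    let arr = if len = 0 then [||] else Array.make len (f 0) in
    for i = 1 to len - 1 do arr.(i) <- f i done;
    { get = Array.get arr; length = len }
  let get v i = v.get i
  let length v = v.length
  let map f v = make v.length (fun i -> f (v.get i))
end

(* An application, yet polymorphic thanks to the covariance of t. *)
let empty = Vector.make 0 (fun _ -> assert false)

let squares = Vector.make 4 (fun i -> i * i)
```

Declaring the type directly as an array (type 'a t = 'a array) would not satisfy the +'a annotation, since arrays are non-variant.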

To be fully honest, there are cases where an overspecified variance results in mak- 
ing some implementations impossible. But this should be seen as a problem of bad 
design, and the above example gives a natural criterion for proper variance of an ab- 
stract datatype: this should at most be the variance of the minimal set of operations 
which cannot be defined without access to the implementation. 

4.3 Ease of Use 

Does the introduction of variance make the language harder to use? There are actually 
two problems: understanding the new typing rule, and having to write variance annota- 
tions for abstract datatypes. 

Seeing that the value restriction itself is rather hard to grasp (notwithstanding the
simplicity of its definition), one might argue that any improvement of polymorphism
(when it does not involve changes in the type algebra itself) is good, as it is going to
avoid some non-intuitive type errors. Moreover, once you understand the typing
algorithm, the relaxed value restriction introduces no leap in complexity.

More disturbing may be the need for variance annotations. For Objective Caml, they 
were already there, as the language allows explicit subtyping. So we are just exploiting 
an existing feature. But even if it were to be newly added, keep in mind that explicit 
annotations are only needed for abstract datatype definitions, and that there is a good 
semantic criterion as to what they should be. Of course this information is only optional: 
at worst, we are still as powerful as the value restriction. 

4.4 Compilation

A last concern is with compilation, in particular for compilers using type information 
during compilation or at runtime. These compilers often involve a translation to an 
explicitly typed second-order lambda-calculus, which does not seem to be a good target 
for our system since, as we will see in the next sections, our type soundness seems to 
require subtyping. 

A first remark is that the problem lies not so much in our approach as in the
mismatch between polymorphic data structures and second-order lambda-calculus.
While there can be no value whose type is a covariant variable inside the data
structure, second-order lambda-calculus would have us pass its (useless) type around.

The answer is simple enough: we just have to extend the target type system with the
needed subtyping, knowing that this will not impact later stages of compilation as there
are no values of type zero anyway. To take full advantage of this remark, we may even
replace all purely covariant type variables with zero (in value bindings too), so as
to minimize the type information passed around.

While zero is not a problem, compilation is one of the reasons we have stopped 
short of exploiting the dual observation: that assuming a “type of all values” top, the 
monomorphic type variables that appear only in contravariant positions are generaliz- 
able too. This would have had an extra advantage: this should alleviate the principality 
problem, which had us restrict generalizability to type variables of rank 0. Only vari- 
ables that appear both in covariant and contravariant position would not be generaliz- 
able. However, the existence of top would require all values to be represented in a 
uniform way. This is just what type-passing implementations want to avoid. Actually, 
even Objective Caml, which has only a very conservative use of type information, does 
not satisfy this property^. 

5 Formalization and Type System 

In this section we fully formalize our language, and propose a type system where the 
extra polymorphism described in previous examples is recovered automatically (with- 
out the need for explicit coercions). Yet this type system, which we call the relaxed 
value restriction, enjoys the principal type property. 

We base ourselves on Wright and Felleisen’s formalization of Reference ML [16]. 
For our results to be meaningful, we need to handle more varied data, so we also add 
pairs and lists, as they do not incur any difficulty in typing. 

Expressions distinguish between values and non-values. The store is introduced by
the $\rho\theta.e$ binder and is handled explicitly. Two kinds of contexts are defined for reduction
rules: R-contexts, used in store operations, and E-contexts, used in evaluation.

^ The function Obj.repr can be seen as a coercion to top (aka Obj.t), but it is unsafe.

let l = Array.create 2 (Obj.repr 1.0)
val l : Obj.t array = [|<abstr>; <abstr>|]
l.(1) <- Obj.repr 1
Segmentation fault

In one sentence: arrays of float values have a special representation, and operations on
arrays are not semantically correct when float and int values are mixed, which is of course
impossible using the existing type system and safe operations.
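\[
\begin{array}{rcl}
e & ::= & v \mid e_1\ e_2 \mid \mathtt{let}\ x = e_1\ \mathtt{in}\ e_2 \mid \rho\theta.e \\
v & ::= & x \mid Y \mid \lambda x.e \mid \mathtt{ref} \mid {!} \mid {:=} \mid {:=}\ v \mid (v, v) \mid \pi_1 \mid \pi_2 \mid \mathtt{nil} \mid \mathtt{cons}\ v \mid \mathtt{uncons}\ v\ v \\
\theta & ::= & \{\langle x, v\rangle\}^{*} \\
R & ::= & [\,] \mid R\ e \mid v\ R \mid \mathtt{let}\ x = R\ \mathtt{in}\ e \\
E & ::= & [\,] \mid E\ e \mid v\ E \mid \mathtt{let}\ x = E\ \mathtt{in}\ e \mid \rho\theta.E
\end{array}
\]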

As in Reference ML, both := and := v are values, reflecting the fact := can only be 
reduced when given two arguments. 

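Reduction rules are given in figure 2. They are those of Reference ML, with a few
innocuous additions. We define one-step reduction as $E[e] \longmapsto E[e']$ whenever $e \to e'$,
and multi-step reduction as $e_1 \longmapsto^{*} e_n$ whenever $e_1 \longmapsto e_2 \longmapsto \cdots \longmapsto e_n$. Reduction does not
produce badly-formed expressions.

Lemma 1. If $e$ is a well-formed expression (i.e. no non-value appears at a value
position), and $e \longmapsto e'$, then $e'$ is well-formed.

Types are the usual monotypes and polytypes.

\[
\begin{array}{rcl}
\tau & ::= & \alpha \mid \tau \to \tau \mid \tau\ \mathtt{ref} \mid \tau \times \tau \mid \tau\ \mathtt{list} \\
\sigma & ::= & \tau \mid \forall\bar\alpha.\tau
\end{array}
\]

An instantiation order is defined on polytypes by $\forall\bar\alpha.\tau \succ \forall\bar\beta.\tau'$ iff
$\bar\beta \cap \mathrm{FTV}(\forall\bar\alpha.\tau) = \emptyset$ and there is a vector $\bar\tau$ of monotypes such that $[\bar\tau/\bar\alpha]\tau = \tau'$.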

We type this language using the typing rules in figure 3. Those rules are again taken
from Reference ML, assuming all type variables to be imperative (which is equivalent to
applying the value restriction, cf. [1], page 6). The only exception is the $\textsc{Let}_e$ rule, which
generalizes some variables. In the value case,
$\mathit{Close}(\tau_1, \Gamma) = \forall\,\mathrm{FTV}(\tau_1) \setminus \mathrm{FTV}(\Gamma).\,\tau_1$
as usual, but in the non-value case we still generalize safe variables:
$\mathit{CovClose}(\tau_1, \Gamma) = \forall\,\mathrm{FTV}(\tau_1) \setminus V^-(\tau_1) \setminus \mathrm{FTV}(\Gamma).\,\tau_1$,
with $V^-$ the set of dangerous variables defined in
figure 1. The definition of $V^-$ captures more variables than the usual definition of
contravariant occurrences. We deem dangerous all occurrences appearing in a contravariant
branch of a type. While this is not necessary to ensure type soundness, we need it to
keep principality of type inference. For instance, consider the following function.

\[ \mathtt{let}\ f = \mathtt{let}\ r = \mathtt{ref}\ \mathtt{nil}\ \mathtt{in}\ \lambda k.\,Y\,(\lambda f.f)\ {!r} \]

As the type of $Y\,(\lambda f.f)$ is $\forall\alpha\beta.\,\alpha \to \beta$, we expect the principal type of $f$ to be
$\forall\beta.\,\gamma \to \beta$, with $\gamma$ a non-generalizable variable. However, if we were to generalize covariant
variables at ranks higher than 0, then $\forall\beta\delta.\,(\delta \to \gamma) \to \beta$ would be another acceptable
type for $f$, and neither of the two is an instance of the other, i.e. we would have lost
principality.
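
The two closure operations can be sketched over a small type syntax. Figure 1 is not reproduced in this section, so the function dangerous below is an assumed reading of the prose above: every variable under a ref or to the left of an arrow is dangerous. All names (ty, ftv, dangerous, close, cov_close) are illustrative, and each closure returns the set of quantified variables rather than a full type scheme.

```ocaml
module S = Set.Make (String)

type ty =
  | Var of string
  | Ref of ty
  | Arrow of ty * ty
  | Pair of ty * ty
  | List of ty

(* Free type variables of a monotype. *)
let rec ftv = function
  | Var a -> S.singleton a
  | Ref t | List t -> ftv t
  | Arrow (t1, t2) | Pair (t1, t2) -> S.union (ftv t1) (ftv t2)

(* Dangerous variables (assumed definition): everything under ref,
   and everything in the argument branch of an arrow. *)
let rec dangerous = function
  | Var _ -> S.empty
  | Ref t -> ftv t
  | Arrow (t1, t2) -> S.union (ftv t1) (dangerous t2)
  | Pair (t1, t2) -> S.union (dangerous t1) (dangerous t2)
  | List t -> dangerous t

(* Variables quantified by Close: all those not free in the environment. *)
let close env_ftv t = S.diff (ftv t) env_ftv

(* Variables quantified by CovClose: as Close, minus the dangerous ones. *)
let cov_close env_ftv t = S.diff (S.diff (ftv t) (dangerous t)) env_ftv

(* (d -> g) -> b, the rank-1 alternative type discussed above. *)
let rank1 = Arrow (Arrow (Var "d", Var "g"), Var "b")
```

Under this reading, cov_close on rank1 quantifies only b: the variable d sits in the argument branch of the outer arrow, so it is treated as dangerous even though its occurrence is covariant.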

As we explained in section 3.5, rule $\textsc{Let}_e$ does not unshare covariant type variables,
as it would be sound to do, but only allows for more type variables to be generalized.
Unsharing variables would break even the partial subject reduction we define below.

Fig. 2. Reduction rules



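\[
\dfrac{\Gamma(x) \succ \tau}{\Gamma \vdash x : \tau}\ (\textsc{Var})
\qquad
\dfrac{\Gamma \vdash e_1 : \tau_2 \to \tau_1 \quad \Gamma \vdash e_2 : \tau_2}{\Gamma \vdash e_1\ e_2 : \tau_1}\ (\textsc{App})
\]

\[
\dfrac{\Gamma[x \mapsto \tau_1] \vdash e : \tau_2}{\Gamma \vdash \lambda x.e : \tau_1 \to \tau_2}\ (\textsc{Abs})
\]

\[
\dfrac{\Gamma \vdash v : \tau_1 \quad \Gamma[x \mapsto \mathit{Close}(\tau_1, \Gamma)] \vdash e : \tau_2}{\Gamma \vdash \mathtt{let}\ x = v\ \mathtt{in}\ e : \tau_2}\ (\textsc{Let}_v)
\]

\[
\dfrac{\Gamma \vdash e_1 : \tau_1 \quad \Gamma[x \mapsto \mathit{CovClose}(\tau_1, \Gamma)] \vdash e_2 : \tau_2}{\Gamma \vdash \mathtt{let}\ x = e_1\ \mathtt{in}\ e_2 : \tau_2}\ (\textsc{Let}_e)
\]

\[
\dfrac{\Gamma \vdash v_1 : \tau_1 \quad \Gamma \vdash v_2 : \tau_2}{\Gamma \vdash (v_1, v_2) : \tau_1 \times \tau_2}\ (\textsc{Pair})
\qquad
\dfrac{\Gamma \vdash v : \tau \times \tau\ \mathtt{list}}{\Gamma \vdash \mathtt{cons}(v) : \tau\ \mathtt{list}}\ (\textsc{Cons})
\]

\[
\dfrac{\Gamma[x_j \mapsto \tau_j\ \mathtt{ref}]_1^n \vdash e : \tau \quad \Gamma[x_j \mapsto \tau_j\ \mathtt{ref}]_1^n \vdash v_i : \tau_i \ \ (1 \le i \le n)}{\Gamma \vdash \rho\langle x_1, v_1\rangle \ldots \langle x_n, v_n\rangle.e : \tau}\ (\textsc{Rho})
\]

Axioms

\[
\begin{array}{l}
\Gamma \vdash Y : ((\tau_1 \to \tau_2) \to \tau_1 \to \tau_2) \to \tau_1 \to \tau_2 \\
\Gamma \vdash \mathtt{ref} : \tau \to \tau\ \mathtt{ref} \qquad \Gamma \vdash {!} : \tau\ \mathtt{ref} \to \tau \qquad \Gamma \vdash {:=} : \tau\ \mathtt{ref} \to \tau \to \tau \\
\Gamma \vdash \pi_1 : \tau_1 \times \tau_2 \to \tau_1 \qquad \Gamma \vdash \pi_2 : \tau_1 \times \tau_2 \to \tau_2 \qquad \Gamma \vdash \mathtt{nil} : \tau\ \mathtt{list} \\
\Gamma \vdash \mathtt{uncons} : (\tau_1\ \mathtt{list} \to \tau_2) \to (\tau_1 \times \tau_1\ \mathtt{list} \to \tau_2) \to \tau_1\ \mathtt{list} \to \tau_2
\end{array}
\]

Fig. 3. Typing rules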



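\[
\dfrac{\Gamma \models e : t \quad t \le t'}{\Gamma \models e : t'}\ (\textsc{b-Sub})
\qquad
\dfrac{t \in \Gamma(x)}{\Gamma \models x : t}\ (\textsc{b-Var})
\]

\[
\dfrac{\Gamma[x \mapsto {\uparrow}t_1] \models e : t_2}{\Gamma \models \lambda x.e : t_1 \to t_2}\ (\textsc{b-Abs})
\qquad
\dfrac{\Gamma \models e_1 : t_2 \to t_1 \quad \Gamma \models e_2 : t_2}{\Gamma \models e_1\ e_2 : t_1}\ (\textsc{b-App})
\]

\[
\dfrac{(\forall t \in s)\ \ \Gamma \models v : t \quad \Gamma[x \mapsto s] \models e : t'}{\Gamma \models \mathtt{let}\ x = v\ \mathtt{in}\ e : t'}\ (\textsc{b-Let}_v)
\]

\[
\dfrac{\Gamma \models e_1 : t_1 \quad \Gamma[x \mapsto {\uparrow}t_1] \models e_2 : t_2}{\Gamma \models \mathtt{let}\ x = e_1\ \mathtt{in}\ e_2 : t_2}\ (\textsc{b-Let}_e)
\]

\[
\dfrac{\Gamma[x_j \mapsto {\uparrow}(t_j\ \mathtt{ref})]_1^n \models e : t \quad \Gamma[x_j \mapsto {\uparrow}(t_j\ \mathtt{ref})]_1^n \models v_i : t_i \ \ (1 \le i \le n)}{\Gamma \models \rho\langle x_1, v_1\rangle \ldots \langle x_n, v_n\rangle.e : t}\ (\textsc{b-Rho})
\]

Fig. 4. Typing rules for B(T)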
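We include the Rho typing rule for completeness, but we cannot use it to obtain full
subject reduction. We can see this on the following example^.

\[
\begin{array}{l}
\mathtt{let}\ f = (\mathtt{let}\ r = \mathtt{ref}\ \mathtt{nil}\ \mathtt{in}\ \lambda x.{!r})\ \mathtt{in}\ (\mathtt{cons}(\mathtt{nil}, f\ \mathtt{nil}),\ \mathtt{cons}(\mathtt{ref}\ \mathtt{nil}, f\ \mathtt{nil})) \\
\quad \longmapsto^{*}\ \rho\langle r, \mathtt{nil}\rangle.(\mathtt{cons}(\mathtt{nil}, (\lambda x.{!r})\ \mathtt{nil}),\ \mathtt{cons}(\mathtt{ref}\ \mathtt{nil}, (\lambda x.{!r})\ \mathtt{nil}))
\end{array}
\]

In the first line, $f$ can be given the polymorphic type $\forall\alpha.\,\beta\ \mathtt{list} \to \alpha\ \mathtt{list}$, with $\beta$
a non-generalized type variable. When we apply $f$ to nil we may get any list. The type
of the whole expression is $(\tau_1\ \mathtt{list}\ \mathtt{list} \times \tau_2\ \mathtt{list}\ \mathtt{ref}\ \mathtt{list})$. However, after
reduction, $r$ can only be given a monomorphic type, and its two occurrences appear in
incompatible type contexts.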

In the absence of direct subject reduction, we must prove type soundness in an 
indirect way. Following our intuition, we could recover subject reduction in a stronger 
system, by adding a subsumption rule. 



\[
\dfrac{\Gamma \vdash e : \tau[\mathtt{zero}/\bar\alpha] \qquad \bar\alpha \cap V^-(\tau) = \emptyset}{\Gamma \vdash e : \tau}
\]



Rather than doing this directly, and bearing the burden of proof, we will do this in the
next section by translating our derivations into a known type system validating this rule.
We believe that an appropriate form of subsumption (direct or indirect) is essential to
proofs of subject reduction for type systems validating our $\textsc{Let}_e$ rule.

On the other hand, principality is a static property of terms, and we can prove it
easily by trivially modifying the inference algorithm W, using CovClose in place of
Close for non-values. This is clearly sound: this is our rule. This is also complete:
CovClose is monotonic with respect to the instantiation order $\succ$, that is, for any type
substitution $S$, we have $\mathit{CovClose}(\tau, \Gamma) \succ \mathit{CovClose}(S(\tau), S(\Gamma))$.

Proposition 1 (principality). If, for a given pair $(\Gamma, e)$, there is a $\tau_0$ such that
$\Gamma \vdash e : \tau_0$ is derivable, then there exists a $\sigma$ such that for any $\tau$, $\Gamma \vdash e : \tau$ iff $\sigma \succ \tau$.



We can also verify a partial form of subject reduction, limited to non side-effecting 
reductions, but allowing those reductions to happen anywhere in a term. While insuffi- 
cient to prove type soundness, this property is useful to reason about program transfor- 
mations. 



\[
\begin{array}{rcl}
C & ::= & [\,] \mid C\ e \mid e\ C \mid \mathtt{let}\ x = C\ \mathtt{in}\ e \mid \mathtt{let}\ x = e\ \mathtt{in}\ C \\
  & \mid & \rho\theta\langle x, C\rangle.e \mid \rho\theta.C \mid \lambda x.C \mid (C, v) \mid (v, C)
\end{array}
\]

Proposition 2 (partial subject reduction). Non side-effecting reductions, i.e. rules
$(\beta_v)$, $(\mathit{let})$, $(Y)$, $(\pi_i)$, $(\mathit{un}_i)$, preserve typing: for any context $C$, if $\Gamma \vdash C[e] : \tau$ and
$e \to e'$, then $\Gamma \vdash C[e'] : \tau$.

The proof can be easily transposed from any proof of subject reduction for applicative
ML. We only need to verify that the substitution lemma still holds in the presence of our
distinction between $\textsc{Let}_v$ and $\textsc{Let}_e$.

Lemma 2 (substitution). If $\Gamma[x \mapsto \sigma_1] \vdash e : \tau$ and $\Gamma \vdash v : \tau_1$ and
$\mathit{Close}(\tau_1, \Gamma) \succ \sigma_1$, then $\Gamma \vdash e[v/x] : \tau$.

^ For the sake of conciseness we use pairs of expressions, rather than an expanded form where pairs
contain only values; and we write $e_1; e_2$ as a shorthand for $\mathtt{let}\ x = e_1\ \mathtt{in}\ e_2$ ($x$ fresh). This has
no impact on typing.

6 Type Soundness

Rather than extending our own type system with subsumption, we will reuse one that 
already has the required combination of polymorphism, imperative operations, and sub- 
typing. A good choice is Pottier’s B(T) [17], as its typing rules closely match ours. 
B(T) was originally developed as an intermediate step in the proof of type soundness
for HM(X), a constraint-based polymorphic type system [18]. B(T) is distinctive in
its extensional approach to polymorphism: polytypes are not expressed syntactically,
but as (possibly infinite) sets of ground monotypes. For us, its main advantages are its
simplicity (no need to introduce constraints as in HM(X)), and the directness of the
translation of typing derivations.

We give here a condensed account of the definition of B(T), which should be suf- 
ficient to understand how a typing derivation in our system can be mapped to a typing 
derivation in an instance of B(T). 

The T in B(T) represents a universe of monotypes, equipped with a subtyping
relation $\le$, serving as parameter to the type system. Monotypes in T are denoted by $t$.
$\to$ should be a total function from $T \times T$ into $T$, such that $t_1 \to t_2 \le t_1' \to t_2'$
implies $t_1' \le t_1$ and $t_2 \le t_2'$. ref should be a total function from $T$ to $T$, such that
$t\ \mathtt{ref} \le t'\ \mathtt{ref}$ implies $t = t'$. Moreover $t_1 \to t_2 \le t\ \mathtt{ref}$ and $t\ \mathtt{ref} \le t_1 \to t_2$
should both be false for any $t, t_1, t_2$ in $T$. Polytypes $s$ are upward-closed subsets of $T$
(i.e. if $t \in s$ and $t \le t'$ then $t' \in s$). We write ${\uparrow}t$ for the upward closure of a monotype
(the set of all its supertypes).

The terms and reduction rules in B(T) are identical to those in our system (exclud- 
ing pairs and lists). While Pottier’s presentation uses a different syntax for representing 
and updating the store, the presentations are equivalent, ours requiring only more re- 
duction steps. We will stick to our presentation. 

Typing judgments are written $\Gamma \models e : t$ with $\Gamma$ a polytype environment (mapping
identifiers to upward-closed sets of monotypes) and $t$ a monotype. Typing rules^ are
given in figure 4. They are very similar to ours: you just have to transpose all $\tau$'s into
$t$'s and all $\vdash$ into $\models$. The only changes are that $\textsc{b-Let}_e$ is now monomorphic (this is
the strict value restriction), subsumption b-Sub is added, and polymorphism is handled
semantically in b-Var and $\textsc{b-Let}_v$. Axioms for references are included.

The following theorem is proved in [17], section 3, for any $(T, \le)$ satisfying the
above requirements.

Theorem 1 (Subject Reduction). If $e \longmapsto e'$, where $e, e'$ are closed, then $\Gamma \models e : t$
implies $\Gamma \models e' : t$.

For our purpose, we choose T as the set of all types generated by the type
constructors zero, int, $\to$, ref, $\times$, list and the set of all type variables $\{\alpha, \beta, \ldots\}$. The
variables are introduced here as type constants, to ease the translation, but they are
unrelated to polymorphism: there is no notion of variable quantification in B(T). zero is
an extra type constructor, which need not be included in our original language. The
subtyping relation is defined as $\mathtt{zero} \le t$ and $t \le t$ for any $t$ in T, and extended through
constructors, all covariant in their parameters, except ref which is non-variant, and
$\to$ which is contravariant in its first parameter and covariant in its second one. This
conforms to the requirements for B(T), meaning that subject reduction holds in the
resulting system. We also extend the language, reduction and typing rules with Pair,
Cons and Axioms about Y, pairs and lists. Extending subject reduction to these
features presents no challenge; the concerned reader is invited to check this (and other
details of formalization), on the remarkably short proof in [17].

^ In Pottier's presentation, a judgment is written $\Gamma, M \models e : t$; we have merged $\Gamma$ and $M$ ($M$ only
mapping to monotypes), as our syntax for references permits. b-Rho merges b-Store and
b-Conf from the original presentation.
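
The subtyping relation chosen for this instance of B(T) is simple enough to be written down directly. The following is a sketch over an illustrative datatype of ground monotypes; the names ty and subtype are assumptions, not taken from the paper or from [17].

```ocaml
type ty =
  | Zero
  | Int
  | TVar of string          (* type variables, treated as constants *)
  | Ref of ty
  | Arrow of ty * ty
  | Pair of ty * ty
  | List of ty

(* subtype t t' holds when t <= t'. *)
let rec subtype t t' =
  match t, t' with
  | Zero, _ -> true                          (* zero <= t for any t *)
  | Int, Int -> true
  | TVar a, TVar b -> a = b
  | Ref u, Ref u' -> u = u'                  (* ref is non-variant *)
  | Arrow (a, b), Arrow (a', b') ->
      subtype a' a && subtype b b'           (* contravariant, covariant *)
  | Pair (a, b), Pair (a', b') -> subtype a a' && subtype b b'
  | List u, List u' -> subtype u u'
  | _ -> false
```

In particular a function type is never related to a reference type, and two reference types are related only when their arguments are equal, as the requirements on T demand.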

The progress lemma depends more directly on the syntax of expressions, and we
cannot directly reuse Pottier's proof. However, our reduction and typing rules are
basically the same as in [16].

Lemma 3 (Progress). For any closed $e$, if for all $e'$ such that $e \longmapsto^{*} e'$ there is $\Gamma$ and $t$
such that $\Gamma \models e' : t$, then reducing $e$ either diverges or leads to a value.

Combining the above subject reduction and progress, our instance of B(T) is sound. 

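We now present the translation itself. First we must be able to translate each
component of a typing judgment. The expression part is left unchanged. Types are translated
under a substitution $\xi : V \to T$.

\[
\begin{array}{ll}
[\![\alpha]\!]\xi = \xi(\alpha) & [\![\tau_1 \times \tau_2]\!]\xi = [\![\tau_1]\!]\xi \times [\![\tau_2]\!]\xi \\
[\![\tau\ \mathtt{ref}]\!]\xi = [\![\tau]\!]\xi\ \mathtt{ref} & [\![\tau\ \mathtt{list}]\!]\xi = [\![\tau]\!]\xi\ \mathtt{list} \\
[\![\tau_1 \to \tau_2]\!]\xi = [\![\tau_1]\!]\xi \to [\![\tau_2]\!]\xi &
\end{array}
\]

This translation is extended to polytypes appearing in typing environments.

\[
[\![\forall\alpha_1 \ldots \alpha_n.\tau]\!]\xi = \{\, t \mid (t_1, \ldots, t_n) \in T^n,\ [\![\tau]\!](\xi[\alpha_1 \mapsto t_1, \ldots, \alpha_n \mapsto t_n]) \le t \,\}
\]

Before going on to translate full derivations, we state a lemma about the single
subsumption step we need.

Lemma 4. Let $\bar\alpha$ be a set of type variables that appear only covariantly in $\tau_1$. Let $\xi$ be
any translation substitution. Then $[\![\forall\bar\alpha.\tau_1]\!]\xi = {\uparrow}[\![\tau_1]\!](\xi[\bar\alpha \mapsto \mathtt{zero}])$.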

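Finally the derivation is translated by induction on its structure, transforming
$\Gamma \vdash e : \tau$ into $[\![\Gamma]\!]\xi \models e : [\![\tau]\!]\xi$ for any $\xi$.

- if the last rule applied is $\textsc{Let}_e$ and $\mathit{CovClose}(\tau_1, \Gamma) = \forall\bar\alpha.\tau_1$ then it is translated
into

\[
\dfrac{[\![\Gamma]\!]\xi' \models e_1 : [\![\tau_1]\!]\xi' \qquad [\![\Gamma[x \mapsto \forall\bar\alpha.\tau_1]]\!]\xi \models e_2 : [\![\tau_2]\!]\xi}{[\![\Gamma]\!]\xi \models \mathtt{let}\ x = e_1\ \mathtt{in}\ e_2 : [\![\tau_2]\!]\xi}
\]

where $\xi' = \xi[\bar\alpha \mapsto \mathtt{zero}]$. By Lemma 4, we have $[\![\forall\bar\alpha.\tau_1]\!]\xi = {\uparrow}[\![\tau_1]\!]\xi'$, so this is
an instance of rule $\textsc{b-Let}_e$. Note also that $[\![\Gamma]\!]\xi' = [\![\Gamma]\!]\xi$ as $\bar\alpha \cap \mathrm{FTV}(\Gamma) = \emptyset$.

- if the last rule applied is $\textsc{Let}_v$ and $\mathit{Close}(\tau_1, \Gamma) = \forall\alpha_1 \ldots \alpha_n.\tau_1$ then it becomes

\[
\dfrac{
  \dfrac{[\![\Gamma]\!]\xi' \models v : [\![\tau_1]\!]\xi'}{[\![\Gamma]\!]\xi \models v : t}\ (\textsc{b-Sub})
  \qquad
  [\![\Gamma]\!]\xi[x \mapsto s] \models e : [\![\tau_2]\!]\xi
}{
  [\![\Gamma]\!]\xi \models \mathtt{let}\ x = v\ \mathtt{in}\ e : [\![\tau_2]\!]\xi
}\ (\textsc{b-Let}_v)
\]

where $s = [\![\forall\alpha_1 \ldots \alpha_n.\tau_1]\!]\xi$, $t$ ranges over all elements of $s$, and
$\xi' = \xi[\alpha_1 \mapsto t_1, \ldots, \alpha_n \mapsto t_n]$ is such that $[\![\tau_1]\!]\xi' \le t$. Here again $[\![\Gamma]\!]\xi' = [\![\Gamma]\!]\xi$.

- if the last rule applied is Var, it becomes

\[
\dfrac{[\![\tau]\!]\xi \in ([\![\Gamma]\!]\xi)(x)}{[\![\Gamma]\!]\xi \models x : [\![\tau]\!]\xi}\ (\textsc{b-Var})
\]

- other cases are trivial induction.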



From this construction we can obtain the following proposition. 

Proposition 3. If $\Gamma \vdash e : \tau$ is derivable in ML with the relaxed value restriction, then
$[\![\Gamma]\!]\xi \models e : [\![\tau]\!]\xi$ is derivable in B(T) for any $\xi$.

Now, suppose that we restrict ourselves to closed expressions whose types do not 
contain references nor function types. Normal forms of such expressions can only be 
data of the form: 

\[ d ::= \mathtt{nil} \mid (d, d) \mid \mathtt{cons}\ d \]

For such normal forms, type derivations in B(T) coincide with our system. 

From this and type soundness for our instance of B(T) we can deduce the type 
soundness of ML with the relaxed value restriction, as stated below. 

Theorem 2 (Type Soundness). If $\emptyset \vdash e : \delta$ with $\delta$ any type of the form
$\delta ::= \alpha \mid \delta \times \delta \mid \delta\ \mathtt{list}$, then reducing $e$ either diverges or leads to a normal form $d$, and $\emptyset \vdash d : \delta$.



7 Conclusion 

Thanks to a small observation on the relation between polymorphism and subtyping
(that zero in a covariant position is equivalent to a universally quantified type
variable), we have been able to smooth some of the rough edges of the value
restriction, while keeping all of its advantages. This is a useful result, which has already been
integrated in the Objective Caml 3.07 compiler. Hopefully this should make the use of
polymorphic data structures easier.

Notwithstanding our achievements, this paper does nothing to solve the fundamental 
problem of the value restriction, namely that by assuming all functions to be imperative, 
it is overly pessimistic. We have been able to rescue some cases that were probably not 
even considered when it was introduced. But there is no easy solution for more involved 
cases, with polymorphic function types in the data. 

The triviality of this result brings another question: why wasn’t it discovered earlier? 

Actually, this specific use of subtyping is not new: the fact has not attracted very
much attention, but our $\textsc{Let}_e$ rule is already admissible in HM(X). This could give yet
another way to prove type soundness for our system: by defining it as a subsystem of a
sufficiently feature-rich instance of HM(X) as found in [19]. We preferred B(T) for its
robustness, and the lightness of its definition and proof, but this last approach would be
purely syntactic.

Abstract. Mixin modules are a notion of modules that allows cross-module
recursion and late binding, two features missing in ML-style
modules. They have been well defined in a call-by-name setting, but in a 
call-by-value setting, they tend to conflict with the usual static restric- 
tions on recursive definitions. Moreover, the semantics of instantiation 
has to specify an order of evaluation, which involves a difficult design 
choice. Previous proposals [14, 16] rely on the dependencies between 
components to compute a valid order of evaluation. In such systems, 
mixin module types must carry some information on the dependencies 
between their components, which makes them verbose. In this paper, 
we propose a new, simpler design for mixin modules in a call-by-value 
setting, which avoids this problem. 



1 Introduction 

1.1 The Problem 

For programming “in the large”, it is desirable that the programming language
offer linguistic support for the decomposition and structuring of programs into
modules. A good example of such linguistic support is the ML module system 
and its powerful notion of parameterized modules. Nevertheless, this system is 
weak on two important points. 

(Mutual recursion) Mutually recursive definitions cannot be split across sepa- 
rate modules. There are several cases where this hinders modularization [6]. 
(Modifiability) The language does not propose any mechanism for incremental 
modification of an already-defined module, similar to inheritance and over- 
riding in object-oriented languages. 

Class-based object-oriented languages provide excellent support for these two
features. Classes are naturally mutually recursive, and inheritance and method
overriding answer the need for modifiability. However, viewed as a module
system, classes have two weaknesses: they do not offer a general parameterization
mechanism (no higher-order functions on classes), and the mechanisms they offer
to describe pre-computations (initialization of static and instance variables) lack
generality, since a module system should allow function definitions to alternate
naturally with computational definitions that use these functions.

Mixin modules [4] (hereafter simply called “mixins”) provide an alternative 
approach to modularity that combines some of the best aspects of classes and 
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ML-style modules. Mixins are modules with “holes” (not-yet-defined compo- 
nents), where the holes can be plugged later by composition with other mixins, 
following a late-binding semantics. However, the handling of pre-computations 
and initializations in mixins is still problematic. Most of the previous work on 
mixins, notably by Ancona and Zucca [2] and Wells and Vestergaard [20], is
better suited to a call-by-name evaluation strategy. This strategy makes it
impossible to trigger computations at initialization time (see Sect. 5 for more
details). The choice of a call-by-value setting raises the following two issues.

(Recursive definitions) Since mixin components are not necessarily functions, 
arbitrary recursive definitions can appear dynamically by composition. For 
instance, consider the following two mixins (in an informal concrete syntax):

mixin A = mix
  ? x : int
  ! y = x + 1
end

and

mixin B = mix
  ? y : int
  ! x = y * 2
end

Each of these mixins declares the missing value (marking it with ?) and
defines the other one (marking it with !). The composition of A and B involves
the mutually recursive definition x = y * 2 and y = x + 1.

In most call-by-value languages, recursive definitions are statically restricted, 
in order to be more efficiently implementable [3, 15], and to avoid some 
ill-founded definitions. Obviously, our system should not force language de- 
signers to abandon these properties, and thus needs guards on recursive 
definitions, at the level of both static and dynamic semantics. 

(Order of evaluation) In our system, mixins will contain arbitrary, unevaluated 
definitions, whose evaluation will be triggered by instantiation. Because these 
definitions are arbitrary, the order in which they will be evaluated matters. 
For instance, in a mixin A defining x = 0 and y = x + 1, x must be evalu- 
ated before y. Thus, the semantics of instantiation must define an order of 
evaluation. Moreover, mixins can be built by composition, so the semantics 
of composition must also take the order of definitions into account. 



From the standpoint of dynamic semantics, the second issue involves a design 
decision. From the standpoint of typing, it reduces to the first issue, since the 
existence of a valid order of evaluation is governed by the absence of invalid 
recursive definitions. 



1.2 Instantiation-Time Ordering: Flexible Mixin Modules

The MM language of call-by-value mixins [14, 16, 12] is designed as follows. Mix- 
ins contain unordered definitions. Only at instantiation does the system com- 
pute an order for them, according to their inter-dependencies [14, 16, 12], and to 
programmer-supplied annotations that fix some bits of the final order [16, 12]. 
This solution, which we call flexible mixins, is very expressive w.r.t. code reuse,
since components can be re-ordered according to the context. However, it
appears too complex in some respects.
(Instantiation) In particular, instantiation is too costly, since it involves com-
puting the strongly-connected components of a graph whose size is quadratic 
in the input term, plus a topological sort of the result. 

(Type safety) As explained above, when recursive definitions are restricted, the 
type system must prevent invalid ones. In MM, mixin types contain some
information about the dependencies between definitions. Nevertheless, this 
makes mixin types verbose, and also over-specified, in the sense that any 
change in the dependencies between components can force the type of the 
mixin to change, which is undesirable. 

The first problem is not so annoying in the context of a module system: 
it only has to do with linking operations, and thus should not affect the over- 
all efficiency of programs. The second problem makes the proposed language 
impractical without dedicated graph support. 
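
The instantiation-time ordering discussed above can be illustrated with a small Python sketch. It is not the MM algorithm: it relies on the standard-library graphlib module, the dependency maps are hand-written assumptions, and it simply rejects every cycle instead of computing strongly-connected components.

```python
from graphlib import CycleError, TopologicalSorter


def instantiation_order(deps):
    """Order definitions so that each one follows the names it mentions.

    deps maps a definition name to the set of names used in its body.
    """
    return list(TopologicalSorter(deps).static_order())


# x = 0 and y = x + 1, listed in the "wrong" textual order.
print(instantiation_order({"y": {"x"}, "x": set()}))

# x = y * 2 and y = x + 1 (mixins A and B composed): no valid order exists.
try:
    instantiation_order({"x": {"y"}, "y": {"x"}})
except CycleError:
    print("ill-founded recursive definition")
```

In a flexible design this computation happens at every instantiation, which is the cost that the rigid design of the next section avoids.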

1.3 Early Ordering: Rigid Mixin Modules 

In this paper, we propose a completely different approach, from scratch. We 
introduce Mix, a new language of call-by-value mixins, where mixin components
are ordered, in a rigid way. They can be defined either as single components 
(briefly called “singles”) or as blocks of components. Blocks contain mutually 
recursive definitions, and are restricted to a certain class of values. Conversely, 
singles can contain arbitrary, non-recursive computations. Composition preserves 
the order of both of its arguments, and instantiation straightforwardly converts 
its argument into a module. 

In Mix, the components of a mixin are ordered once and for all at definition
time, so Mix is less expressive than MM. Yet, it has other advantages. First, with
respect to side effects, annotations are no longer needed, since side effects always
respect the syntactic order. Moreover, instantiation is less costly than in MM,
since it runs in O(n log n), where n is the size of the input. Concerning typing,
mixin types have the same structure as mixins themselves: they are sequences
of specifications, which can be either singles or blocks. They avoid the use of
explicit graphs, which improves over MM. Compared to ML module types,
the only differences are that the order matters and that mutually recursive
specifications must be explicitly grouped together. Finally, the metatheory
of Mix is much simpler than that of MM, which makes it more likely to
scale up to a fully-featured language. In summary, we propose Mix as a good
trade-off between expressiveness and user-friendliness for incorporating mixins
into a practical programming language like ML.

The rest of the paper is organized as follows. Section 2 presents an informal 
overview of Mix by example. Section 3 formally defines Mix and its dynamic 
semantics. Section 4 defines a sound type system for Mix. Finally, sections 5 
and 6 review related and future work, respectively. The proofs are omitted from 
this paper for lack of space, but a longer version including them is available as 
a research report [13].

2 Intuitions

As a simplistic introductory example, consider a program that defines two mu- 
tually recursive functions for testing whether an integer is even or odd, and then 
tests whether 56 is even, and whether it is odd. Assume now that it is conceptu- 
ally obvious that everything concerning oddity must go into one program frag- 
ment, and everything concerning evenness must go into another, clearly distinct 
fragment. Here is how this can be done in an informal programming language 
based on Mix, with a syntax mimicking OCaml [17]. 

First, define two mixins Even and Odd as follows. 

mixin Even = mix
  recblock ? odd : int -> bool
  and ! even x = x = 0 or odd (x-1)
  ! even56 = even 56
end

mixin Odd = mix
  recblock ? even : int -> bool
  and ! odd x = x > 0 and even (x-1)
  ! odd56 = odd 56
end

Each of these mixins declares the missing function (marking it 
with ?) and defines the other one (marking it with !), inside a recblock 
which delimits a recursive block. Then, outside of this block, each mixin 
performs one computation. 

In order to link them, and obtain the desired complete mixin, one composes 
Even and Odd, by writing mixin OpenNat = Odd >> Even. Intuitively, compo- 
sition connects the missing components of Even to the corresponding definitions 
of Odd, and vice versa, preserving the order of both mixins. Technically, compo- 
sition somehow passes Odd through Even, with Even acting as a filter, stopping 
the components of Odd when they match one of its own components. This filter- 
ing is governed by some rules: the components of Odd go through Even together, 
until one of them, say component c, matches some component of Even. Then, 
the components of Odd defined to the left of c are stuck at the current point. 
The other components continue their way through Even. Additionally, when two 
components match, they are merged into a single component. 

In our example, odd and even both stop at Even’s recursive block mentioning
them, so the two recursive blocks are merged. Further, odd56 remains
unmatched, so it continues until the end of Even. The obtained mixin is thus
equivalent to

mixin OpenNat = mix
  recblock ! even x = x = 0 or odd (x-1)
  and ! odd x = x > 0 and even (x-1)
  ! even56 = even 56
  ! odd56 = odd 56
end

Note that composition is asymmetric. This mixin still has to be instantiated
in order to trigger its evaluation. This is done by writing module Nat =
close OpenNat, which makes OpenNat into a module equivalent to

module Nat = struct
  let rec even x = x = 0 or odd (x-1)
  and odd x = x > 0 and even (x-1)
  let even56 = even 56
  let odd56 = odd 56
end

which evaluates to the desired result. For comparison, in MM, the final
evaluation order would be computed upon instantiation, instead of composition. It
would involve a topological sort of the strongly-connected components of the
dependency graph of OpenNat. Incidentally, in MM, in order to ensure that even56
is evaluated before odd56, the definition of odd56 would have to state it
explicitly. From the standpoint of typing, explicitly grouping possibly recursive
definitions together makes it possible to get rid of dependency graphs in the
types, thus greatly simplifying the type system.
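
As a sanity check on the example, the module Nat can be transcribed into Python. This is only an illustrative sketch: the function make_nat and the dictionary standing for the module are assumptions of the sketch, not part of Mix. The recursive block becomes two mutually recursive functions, and the two singles are evaluated afterwards in their syntactic order.

```python
def make_nat():
    # The recursive block: both functions may refer to each other.
    def even(x):
        return x == 0 or odd(x - 1)

    def odd(x):
        return x > 0 and even(x - 1)

    # The singles, evaluated in the order fixed by composition.
    even56 = even(56)
    odd56 = odd(56)
    return {"even": even, "odd": odd, "even56": even56, "odd56": odd56}


Nat = make_nat()
print(Nat["even56"], Nat["odd56"])
```

The computation of even56 necessarily happens before that of odd56, mirroring the rigid order of OpenNat.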

3 The Mix Language and Its Dynamic Semantics 

3.1 Syntax 

Pre-terms. Figure 1 defines the set of pre-terms of Mix. It distinguishes names
$X$ from variables $x$, following Harper and Lillibridge [11]. It includes a standard
record construct $\{s\}$, where $s ::= (X_1 = e_1 \ldots X_n = e_n)$, and selection $e.X$. It
features two constructs for value binding, letrec for mutually recursive definitions,
and let for single, non-recursive definitions. Finally, the language provides four
mixin constructs. Basic mixins consist of structures $m = (c_1 \ldots c_n)$, which are
lists of components. A component $c$ is either a single or a block. A single $u$ is
either a named declaration $X \triangleright x = \bullet$, or a definition $L \triangleright x = e$, where $L$
is a label. Labels can be names or the special anonymous label, written $\_$, in
which case the definition is also said to be anonymous. Finally, a block $q$ is a
list of singles. The other constructs are composition ($e_1 \gg e_2$), instantiation
(close $e$), and deletion of a name $X$, written ($e|_{-X}$).

Terms. Proper terms are defined by restricting the set of pre-terms, as follows.
We define Mix values $v$ by $v ::= \{s^v\} \mid \langle m \rangle$, where $s^v ::= (X_1 = v_1 \ldots X_n = v_n)$.
Then, the system is parameterized by the set RecExp of valid recursive
expressions, which must contain only values, and be closed under substitution.

Definition 1 (Terms)

A term of Mix is a pre-term such that: records do not define the same name
twice; bindings do not define the same variable twice; structures define neither
the same name twice nor the same variable twice; and the right-hand sides of
letrec and block definitions belong to RecExp.

Fig. 1. Syntax

The restriction of letrec and block definitions to valid recursive expressions 
both simplifies the semantics of letrec, and models the restrictions put by stan- 
dard call-by-value languages on recursive definitions [17, 19]. Typically, recursive 
definitions can only be functions. 

Records, bindings and structures are respectively considered as finite maps
from names to terms, variables to terms, and pairs of a label and a variable to
terms or $\bullet$. Thus, given a structure $m$, the restriction of $\mathit{dom}(m)$ (the domain of
$m$) to pairs of a name and a variable can be seen as an injective finite map from
names to variables, which we call $\mathit{VofN}(m)$.

Terms are considered equivalent modulo proper [20] renaming of bound
variables and modulo the order in blocks and letrec. We denote by $DV(m)$ and $DV(b)$
the sets of variables defined by $m$ and $b$, respectively, and by $DN(m)$ and $DN(s)$
the sets of names defined by $m$ and $s$, respectively.



3.2 Dynamic Semantics 

The semantics of Mix is defined as a reduction relation on pre-terms in figure 2, 
using notions defined in figure 3. It is compatible with bound variable renaming 
and preserves well-formedness, so we extend it to terms. 

Figure 3 defines evaluation contexts, which enforce a deterministic, call-by- 
value strategy. We can now examine the rules, from the most interesting to the 
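most standard.

$\langle m_1 \rangle \gg \langle m_2 \rangle \to \langle \mathit{Add}(m_1, m_2, \varepsilon) \rangle$ if $m_1 \asymp m_2$ (Compose)

$\text{close } \langle m^c \rangle \to \mathit{Bind}(m^c, \{\mathit{Record}(m^c)\})$ (Close)

$\langle m \rangle|_{-X} \to \langle \mathit{Del}(m, X) \rangle$ (Delete)

$\text{letrec } b \text{ in } e \to \{x \mapsto \text{letrec } b \text{ in } b(x) \mid x \in \mathit{dom}(b)\}(e)$ (LetRec)

$\text{let } x = v \text{ in } e \to \{x \mapsto v\}(e)$ (Let)

$\{s^v\}.X \to s^v(X)$ (Select)

$E[e] \to E[e']$ if $e \to e'$ (Context)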



Fig. 2. Reduction rules 



Composition. Rule Compose describes mixin composition. In order to be
composed, the structures must be made compatible by α-conversion. Namely, we say
that two structures $m_1$ and $m_2$ are compatible, and write $m_1 \asymp m_2$, iff
$DV(m_1) \cap FV(\langle m_2 \rangle) = DV(m_2) \cap FV(\langle m_1 \rangle) = \emptyset$, and for any
$x \in DV(m_1) \cap DV(m_2)$, there exists a name $X$ such that
$\mathit{VofN}(m_1)(X) = \mathit{VofN}(m_2)(X) = x$. This basically
says that both structures agree on the names of variables.

Then, their composition $\langle m_1 \rangle \gg \langle m_2 \rangle$ is $\langle \mathit{Add}(m_1, m_2, \varepsilon) \rangle$, where Add is
defined by induction on $m_1$ by



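$\mathit{Add}(\varepsilon, m_1, m_2) = m_1, m_2$

$\mathit{Add}((m_1, c), m_2, m_3) = \mathit{Add}(m_1, m_2, (c, m_3))$
    if $DN(c) \cap DN(m_2, m_3) = \emptyset$

$\mathit{Add}((m_1, c_1), (m_2^1, c_2, m_2^2), m_3) = \mathit{Add}(m_1, m_2^1, (c_1 \otimes c_2, m_2^2, m_3))$
    if $DN(c_1) \cap DN(m_2^1, m_2^2, m_3) = \emptyset$ and $DN(c_1) \cap DN(c_2) \neq \emptyset$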

Given three arguments $m_1$, $m_2$, $m_3$, Add roughly works as follows. If $m_1$ is
empty, it returns the concatenation of $m_2$ and $m_3$. If the last component $c$ of $m_1$
defines names that are not defined in $m_2$ or $m_3$, then $c$ is pushed at the head
of $m_3$. Finally, when the last component $c_1$ of $m_1$ defines a name also defined
by some $c_2$ in $m_2$, so that $m_2 = (m_2^1, c_2, m_2^2)$, then the third argument becomes
$(c_1 \otimes c_2, m_2^2, m_3)$, where $c_1 \otimes c_2$ is the merging of $c_1$ and $c_2$, which is defined by



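$c_1 \otimes c_2 = c_2 \otimes c_1$

$(X \triangleright x = \bullet) \otimes c = c$ if $\mathit{VofN}(c)(X) = x$

$[q_1] \otimes [q_2] = [q_1, q_2]$ if $DN(q_1) \cap DN(q_2) = \emptyset$

$[X \triangleright x = \bullet, q_1] \otimes [q_2] = [q_1] \otimes [q_2]$ if $\mathit{VofN}(q_2)(X) = x$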



This definition is not algorithmic, but uniquely defines the merging of two 
components, and an algorithm is easy to derive from it: one has to apply rules 
2 and 4 as long as possible, then commute the arguments and apply rules 2 and 
4 as long as possible again, and then finally apply rule 3. Technically, as soon 
as a declaration is matched, it is removed, and when two blocks have no more 
common defined names, their merging is their union. Note that initially, only 
components with common defined names are merged, but that the union takes 
place after all the common names have been reduced. 
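
The definitions of Add and of merging can be turned into a short Python sketch. The encoding is an assumption made for illustration only: variables are identified with names (so the compatibility check is skipped), anonymous labels are left out, a declaration is a Single whose expr is None, a block is a tuple of singles, and a structure is a list of components.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Single:
    name: str                   # the name X; anonymous labels are not modelled
    expr: Optional[str] = None  # None encodes a declaration (a hole)


def names(c):
    """DN of one component: every name it declares or defines."""
    return {c.name} if isinstance(c, Single) else {u.name for u in c}


def all_names(m):
    """DN of a structure, given as a list of components."""
    return set().union(*(names(c) for c in m))


def is_decl(c):
    return isinstance(c, Single) and c.expr is None


def merge(c1, c2):
    # Rule 2, usable in both directions thanks to rule 1 (commutativity).
    if is_decl(c1) and c1.name in names(c2):
        return c2
    if is_decl(c2) and c2.name in names(c1):
        return c1
    if isinstance(c1, tuple) and isinstance(c2, tuple):
        # Rule 4 on the left block, then on the right one after commuting.
        q1 = tuple(u for u in c1 if not (is_decl(u) and u.name in names(c2)))
        q2 = tuple(u for u in c2 if not (is_decl(u) and u.name in names(q1)))
        if not names(q1) & names(q2):
            return q1 + q2  # rule 3: disjoint blocks are concatenated
    raise ValueError("components cannot be merged")


def add(m1, m2, m3):
    if not m1:
        return m2 + m3
    *rest, c = m1
    if not names(c) & all_names(m2 + m3):
        return add(rest, m2, [c] + m3)  # c passes through unmatched
    for i, c2 in enumerate(m2):
        if names(c) & names(c2):
            left, right = m2[:i], m2[i + 1:]
            if names(c) & all_names(left + right + m3):
                break
            return add(rest, left, [merge(c, c2)] + right + m3)
    raise ValueError("composition is stuck")


def compose(m1, m2):
    return add(list(m1), list(m2), [])


even = [(Single("odd"), Single("even", "fun x -> x = 0 or odd (x-1)")),
        Single("even56", "even 56")]
odd = [(Single("even"), Single("odd", "fun x -> x > 0 and even (x-1)")),
       Single("odd56", "odd 56")]
open_nat = compose(odd, even)
print([sorted(names(c)) for c in open_nat])
```

On the Even and Odd mixins of section 2, the sketch merges the two blocks and moves odd56 to the rightmost position, as the informal description of composition suggests.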
$E ::= \{s^v, X = \Box, s\} \mid \Box.X \mid \text{let } x = \Box \text{ in } e \mid \Box \gg e \mid v \gg \Box \mid \text{close } \Box \mid \Box|_{-X}$



Fig. 3. Evaluation contexts 



Example 1. Assuming that Mix is extended with functions, integers and
booleans, the mixin Even described in section 2 is written

$\mathit{even} = \langle\, [\, \mathit{Odd} \triangleright \mathit{odd} = \bullet,\ \mathit{Even} \triangleright \mathit{even} = \lambda x.\,(x = 0) \text{ or } \mathit{odd}\,(x - 1) \,],\ \mathit{Even56} \triangleright \mathit{even56} = \mathit{even}\ 56 \,\rangle$

During composition with the mixin corresponding to Odd, the component
Odd56 traverses the whole structure to go to the rightmost position, then the
two blocks defining Even and Odd are merged, which gives the expected block.



Instantiation. Rule Close describes the instantiation of a complete basic mixin.
A structure is said to be complete iff it does not contain declarations. We denote
complete structures, components, singles, and blocks by $m^c$, $c^c$, $u^c$, and $q^c$,
respectively. Given a complete basic mixin $\langle m^c \rangle$, instantiation first generates a series of
bindings, following the structure of $m^c$, and then stores the results of named
definitions in a record. Technically, close $\langle m^c \rangle$ reduces to $\mathit{Bind}(m^c, \{\mathit{Record}(m^c)\})$,
where Record makes $m^c$ into a record and Bind makes $m^c$ into a binding:
$\mathit{Record}(m^c)$ is defined on singles by

$\mathit{Record}(X \triangleright x = e) = (X = x)$ and $\mathit{Record}(\_ \triangleright x = e) = \varepsilon$,

naturally extended to components and structures by concatenation,
and $\mathit{Bind}(m^c, e)$ is defined inductively over $m^c$ by

$\mathit{Bind}(\varepsilon, e) = e$

$\mathit{Bind}(([u^c_1 \ldots u^c_n], m^c), e) = \text{letrec } \lfloor u^c_1 \rfloor \ldots \lfloor u^c_n \rfloor \text{ in } \mathit{Bind}(m^c, e)$

$\mathit{Bind}((u^c, m^c), e) = \text{let } \lfloor u^c \rfloor \text{ in } \mathit{Bind}(m^c, e)$,

with $\lfloor L \triangleright x = e \rfloor = (x = e)$.

For each component, Bind defines a letrec (if the component is a block) or a
let (if the component is a single), by extracting bindings $x = e$ from singles
$L \triangleright x = e$.
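
The functions Record and Bind can be sketched in Python as a pretty-printer that produces the nest of bindings as text. The encoding is an assumption of this sketch: a complete single carries a label (None for the anonymous label), a variable and an expression kept as a string, a block is a tuple of singles, and the output syntax is merely suggestive.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Single:
    label: Optional[str]  # None plays the role of the anonymous label "_"
    var: str
    expr: str             # complete structures contain definitions only


def singles(c):
    return (c,) if isinstance(c, Single) else c


def record(m):
    """Record: keep one field per named definition, drop anonymous ones."""
    fields = [f"{u.label} = {u.var}"
              for c in m for u in singles(c) if u.label is not None]
    return "{" + "; ".join(fields) + "}"


def bind(m, body):
    """Bind: one letrec per block and one let per single, in order."""
    if not m:
        return body
    c, rest = m[0], m[1:]
    if isinstance(c, tuple):
        defs = " and ".join(f"{u.var} = {u.expr}" for u in c)
        return f"letrec {defs} in {bind(rest, body)}"
    return f"let {c.var} = {c.expr} in {bind(rest, body)}"


def close(m):
    return bind(m, record(m))


open_nat = [
    (Single("Even", "even", "fun x -> x = 0 or odd (x-1)"),
     Single("Odd", "odd", "fun x -> x > 0 and even (x-1)")),
    Single("Even56", "even56", "even 56"),
    Single(None, "tmp", "odd 56"),  # anonymous: evaluated but not exported
]
print(close(open_nat))
```

The anonymous definition still produces a let binding, so its computation takes place, but it does not appear in the final record.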

Other rules. Rule Delete describes the action of the deletion operation. Given
a basic mixin $\langle m \rangle$, $\langle m \rangle|_{-X}$ reduces to $\langle \mathit{Del}(m, X) \rangle$, where $\mathit{Del}(m, X)$ denotes $m$,
where any definition of the shape $X \triangleright x = e$ is replaced with $X \triangleright x = \bullet$.
The next two rules, LetRec and Let, handle value binding. The only
non-obvious rule is LetRec, which enforces the following behavior. The idea is that
the rule applies when the considered binding is fully evaluated, which is always
the case for proper terms. A pre-term letrec $b$ in $e$ reduces to $e$, where each
$x \in \mathit{dom}(b)$ is replaced with a kind of closure representing its definition, namely
letrec $b$ in $b(x)$. Note the notation for capture-avoiding substitution.

Finally, rule Select defines record selection, and rule Context extends
the reduction relation to any evaluation context.



4 Static Semantics 

We now define a sound type system for Mix terms. Defining it on terms rather
than pre-terms means that the considered expressions are well-formed by
definition. Types are defined in figure 4. A Mix type $\tau$ can be either a record type
or a mixin type. A mixin type has the shape $\langle M \rangle$, where $M$ is a signature. A
signature is a list of specifications $C$, which can be either single specifications $U$
or block specifications $Q$. A single specification has the shape $\delta X : \tau$, where $\delta$ is
a flag indicating whether the considered name is a declaration or a definition. It
can be either ?, for declarations, or !, for definitions. A block specification is a list
of single specifications. Record types are finite maps from names to types. Types
are identified modulo the order of specifications in blocks. Environments $\Gamma$ are
finite maps from variables to types. The disjoint union of two environments $\Gamma_1$
and $\Gamma_2$ is written $\Gamma_1 + \Gamma_2$ (which applies only if their domains are disjoint).

Figure 5 presents our type system for Mix. 

Basic mixin modules and enriched specifications. Let us begin with the typing of
basic mixins. Rule T-Struct simply delegates the typing of a basic mixin $\langle m \rangle$
to the rules for typing structures. These rules basically give each component $c$ an
enriched specification, which is a specification enriched with the corresponding
variable. Formally, single enriched specifications have the shape $\delta L \triangleright x : \tau$, and
enriched block specifications are finite sets of these. Notably, this makes it possible
to type anonymous definitions (using enriched specifications like $\delta\_ \triangleright x : \tau$), and also to
recover a typing environment (namely $\{x \mapsto \tau\}$) for typing the next components.
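Expressions

T-Struct: $\dfrac{\Gamma \vdash c_1 \ldots c_n : M^e}{\Gamma \vdash \langle c_1 \ldots c_n \rangle : \langle \mathit{Sig}(M^e) \rangle}$

T-Compose: $\dfrac{\Gamma \vdash e_1 : \langle M_1 \rangle \qquad \Gamma \vdash e_2 : \langle M_2 \rangle}{\Gamma \vdash e_1 \gg e_2 : \langle \mathit{Add}(M_1, M_2, \varepsilon) \rangle}$

T-Close: $\dfrac{\Gamma \vdash e : \langle M^c \rangle}{\Gamma \vdash \text{close } e : \{\mathit{Record}(M^c)\}}$

T-Delete: $\dfrac{\Gamma \vdash \langle m \rangle : \langle M \rangle}{\Gamma \vdash \langle m \rangle|_{-X} : \langle \mathit{Del}(M, X) \rangle}$

T-Var: $\Gamma \vdash x : \Gamma(x)$

T-Record: $\dfrac{\mathit{dom}(s) = \mathit{dom}(S) \qquad \forall X \in \mathit{dom}(s),\ \Gamma \vdash s(X) : S(X)}{\Gamma \vdash \{s\} : \{S\}}$

T-Select: $\dfrac{\Gamma \vdash e : \{S\}}{\Gamma \vdash e.X : S(X)}$

T-LetRec: $\dfrac{\Gamma + \Gamma_b \vdash b : \Gamma_b \qquad \Gamma + \Gamma_b \vdash e : \tau}{\Gamma \vdash \text{letrec } b \text{ in } e : \tau}$

T-Let: $\dfrac{\Gamma \vdash e_1 : \tau_1 \qquad \Gamma + \{x \mapsto \tau_1\} \vdash e_2 : \tau_2}{\Gamma \vdash \text{let } x = e_1 \text{ in } e_2 : \tau_2}$

Singles

T-Some: $\dfrac{\Gamma \vdash e : \tau}{\Gamma \vdash (L \triangleright x = e) : (!L \triangleright x : \tau)}$

T-None: $\Gamma \vdash (X \triangleright x = \bullet) : (?X \triangleright x : \tau)$

Structures and bindings

T-Empty: $\Gamma \vdash \varepsilon : \varepsilon$

T-Single: $\dfrac{\Gamma \vdash u : U^e \qquad \Gamma + \mathit{Env}(U^e) \vdash m : M^e}{\Gamma \vdash (u, m) : (U^e, M^e)}$

T-Block: $\dfrac{\Gamma + \Gamma_q \vdash m : M^e \qquad \Gamma_q = \bigcup_{u \in q} \mathit{Env}(U^e_u) \qquad \forall u \in q,\ u \in \mathit{RecExp}_? \text{ and } \Gamma + \Gamma_q \vdash u : U^e_u}{\Gamma \vdash ([q], m) : ([\bigcup_{u \in q} U^e_u], M^e)}$

T-Binding: $\dfrac{\mathit{dom}(b) = \mathit{dom}(\Gamma_b) \qquad \forall x \in \mathit{dom}(b),\ b(x) \in \mathit{RecExp} \text{ and } \Gamma \vdash b(x) : \Gamma_b(x)}{\Gamma \vdash b : \Gamma_b}$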



Fig. 5. Type system 



Enriched single specifications, block specifications, and signatures are denoted
by $U^e$, $Q^e$, and $M^e$, respectively. Once the structure $m$ has been given such
an enriched signature $M^e$, this result is converted to a proper signature
$M = \mathit{Sig}(M^e)$, assigning to the basic mixin $\langle m \rangle$ the type $\langle M \rangle$. The Sig function merely
forgets variables and anonymous definitions of its argument: it is defined by
straightforward extension of

$\mathit{Sig}(\delta X \triangleright x : \tau) = \delta X : \tau$ and $\mathit{Sig}(\delta\_ \triangleright x : \tau) = \varepsilon$.

Here is how structures are given such enriched signatures. By rule T-Some,
a single definition $L \triangleright x = e$ is given the enriched single specification $!L \triangleright x : \tau$
if $e$ has type $\tau$. By rule T-None, a single declaration $X \triangleright x = \bullet$ can be given
any enriched specification of the shape $?X \triangleright x : \tau$.

Given this, we can define the typing of structures. By rule T-Empty, an
empty structure is given the empty signature. By rule T-Single, a structure
of the shape $(u, m)$ is typed as follows. First, $u$ is typed, yielding an enriched
specification $U^e$. This $U^e$ is made into an environment by the Env function from
enriched signatures to environments. This function associates to any enriched
single specification $\delta L \triangleright x : \tau$ the finite map $\{x \mapsto \tau\}$, and is straightforwardly
extended to signatures. The obtained environment is added to the current
environment for typing the remaining components inductively, yielding an enriched
signature $M^e$. The type of the whole structure is $(U^e, M^e)$.

By rule T-Block, a structure of the shape $([q], m)$ is typed as follows.
An enriched single specification $U^e_u$ is guessed for each single $u$ of $q$. Then,
the set of these enriched single specifications is converted into an environment
$\Gamma_q = \bigcup_{u \in q} \mathit{Env}(U^e_u)$. This environment $\Gamma_q$ is added to the current environment.
Then, it is checked that each single $u$ indeed has the enriched specification $U^e_u$.
Additionally, it is checked that each single $u$ of $q$ is defined by a valid recursive
expression or is a declaration. By abuse of notation, we write this $u \in \mathit{RecExp}_?$.
Finally, the structure $m$ is typed, yielding an enriched signature $M^e$, which is
concatenated to $[\bigcup_{u \in q} U^e_u]$.



Composition. The typing of composition, defined by rule T-Compose, recalls
its dynamic semantics. The type of the composition of two mixins of types $\langle M_1 \rangle$
and $\langle M_2 \rangle$, respectively, is $\langle \mathit{Add}(M_1, M_2, \varepsilon) \rangle$, where Add is defined by

$\mathit{Add}(\varepsilon, M_1, M_2) = M_1, M_2$

$\mathit{Add}((M_1, C), M_2, M_3) = \mathit{Add}(M_1, M_2, (C, M_3))$
    if $DN(C) \cap DN(M_2, M_3) = \emptyset$

$\mathit{Add}((M_1, C_1), (M_2^1, C_2, M_2^2), M_3) = \mathit{Add}(M_1, M_2^1, (C_1 \otimes C_2, M_2^2, M_3))$
    if $DN(C_1) \cap DN(M_2^1, M_2^2, M_3) = \emptyset$ and $DN(C_1) \cap DN(C_2) \neq \emptyset$

which does the same as Add on structures. The merging of two specifications is 
similarly defined by 



$C_1 \otimes C_2 = C_2 \otimes C_1$

$(?X : \tau) \otimes C = C$ if $C(X) = \tau$

$[Q_1] \otimes [Q_2] = [Q_1, Q_2]$ if $DN(Q_1) \cap DN(Q_2) = \emptyset$

$[?X : \tau, Q_1] \otimes [Q_2] = [Q_1] \otimes [Q_2]$ if $Q_2(X) = \tau$

It differs from component merging, because it checks that the types of
matching specifications are the same.

Example 2. When some mutually recursive definitions are grouped together in
blocks, rule T-Block ensures that they are all defined by valid recursive
expressions. Let us now show how the type system rules out mutually recursive
definitions that are not in blocks. Assume given mixins with types
$e_1 : \langle ?Y : \tau_Y, !X : \tau_X \rangle$ and $e_2 : \langle ?X : \tau_X, !Y : \tau_Y \rangle$. When typing their
composition $e_1 \gg e_2$, the component $X$ in $e_1$, by the third rule in the definition
of Add, is merged with its counterpart in $e_2$. This pushes the component $Y$ of $e_2$
to the right, so that we obtain the triple $(?Y : \tau_Y, \varepsilon, (!X : \tau_X, !Y : \tau_Y))$, to
which no rule applies.
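
Example 2 can be replayed with a Python sketch of Add on signatures. The representation is an assumption of the sketch: a specification is a Spec with a flag, a name and a type written as a string, a block specification is a tuple of Spec values, a signature is a list, and a failed composition is reported as a TypeError.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Spec:
    flag: str  # "?" for a declaration, "!" for a definition
    name: str
    ty: str


def specs(c):
    """View a single specification or a block (tuple) as a tuple of Specs."""
    return (c,) if isinstance(c, Spec) else c


def names(c):
    return {s.name for s in specs(c)}


def all_names(sig):
    return set().union(*(names(c) for c in sig))


def drop_matched(q, other):
    """Remove the declarations of q that other provides, checking types."""
    kept = []
    for s in specs(q):
        match = [t for t in specs(other) if t.name == s.name]
        if s.flag == "?" and match:
            if match[0].ty != s.ty:
                raise TypeError(f"type mismatch on {s.name}")
        else:
            kept.append(s)
    return tuple(kept)


def merge(c1, c2):
    for a, b in ((c1, c2), (c2, c1)):
        if isinstance(a, Spec) and a.flag == "?" and a.name in names(b):
            drop_matched(a, b)  # called only for its type check
            return b
    if isinstance(c1, tuple) and isinstance(c2, tuple):
        q1 = drop_matched(c1, c2)
        q2 = drop_matched(c2, q1)
        if not names(q1) & names(q2):
            return q1 + q2
    raise TypeError("specifications cannot be merged")


def add(m1, m2, m3):
    if not m1:
        return m2 + m3
    *rest, c = m1
    if not names(c) & all_names(m2 + m3):
        return add(rest, m2, [c] + m3)
    for i, c2 in enumerate(m2):
        if names(c) & names(c2):
            left, right = m2[:i], m2[i + 1:]
            if names(c) & all_names(left + right + m3):
                break
            return add(rest, left, [merge(c, c2)] + right + m3)
    raise TypeError("no rule applies: composition is ill-typed")


e1 = [Spec("?", "Y", "tY"), Spec("!", "X", "tX")]
e2 = [Spec("?", "X", "tX"), Spec("!", "Y", "tY")]
try:
    add(e1, e2, [])
except TypeError as err:
    print("rejected:", err)

# The same specifications grouped in blocks compose without trouble.
b1 = [(Spec("?", "Y", "tY"), Spec("!", "X", "tX"))]
b2 = [(Spec("?", "X", "tX"), Spec("!", "Y", "tY"))]
print(add(b1, b2, []))
```

With singles, the leftover declaration of Y finds its definition only in the third argument, so the computation is stuck; with blocks, both declarations are resolved inside one merged block.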

Other rules. Rule T-Close types instantiation. Following previous notation,
we let $M^c$ denote complete signatures. Given a complete mixin of type $\langle M^c \rangle$,
close makes it into a record. The type of the result is $\{\mathit{Record}(M^c)\}$, which is
obtained by flattening the blocks in $M^c$, forgetting the ! flags.

Rule T-Delete types deletion. For a mixin $e$ of type $\langle M \rangle$, the rule gives $e|_{-X}$
the type $\langle \mathit{Del}(M, X) \rangle$, in which $\mathit{Del}(M, X)$ denotes $M$, where any specification of
the shape $!X : \tau$ is replaced with $?X : \tau$.

The other typing rules are straightforward. 

Soundness. The type system is sound, in the sense that the following results hold.

Lemma 1 (Subject reduction)

If $\Gamma \vdash e : \tau$ and $e \to e'$, then $\Gamma \vdash e' : \tau$.

Lemma 2 (Progress)

If $\emptyset \vdash e : \tau$, then either $e$ is a value, or there exists $e'$ such that $e \to e'$.

Theorem 1 (Soundness)

If $\emptyset \vdash e : \tau$, then either $e$ reduces to a value, or its evaluation does not terminate.



5 Related Work 

Kernel calculi with mixin modules. The idea of mixin modules comes from that of 
mixins, introduced by Bracha [4] as a model of inheritance. In this model, called 
Jigsaw, classes are represented by mixins, which are equipped with a powerful 
set of modularity operations, and can be instantiated into objects. Syntactically, 
mixins may contain only values, which makes them as restrictive as classes. What 
differentiates them from classes is their cleaner design, which gave other authors 
the idea to generalize them to handle modules as well as objects. 

Ancona and Zucca [2] propose a call-by-name module system based on some 
of Bracha’s ideas, called CMS. Like Mix, CMS extends Jigsaw by allowing any
kind of expressions as mixin definitions, not just values. Unlike in Mix, in CMS,
there is no distinction between modules and mixin modules, which makes sense
in call-by-name languages, since the contents of modules are not evaluated until
selection. In call-by-value, the contents of a module are eagerly evaluated, so they
cannot have a late binding semantics. Thus, modules must be distinguished
from mixin modules, and so CMS is not a suitable model. From the standpoint
of typing, CMS, unlike Mix, but consistently with most call-by-name languages, 
does not control recursive definitions. 

The separation between mixin modules and modules, as well as late binding, 
can be encoded in Wells and Vestergaard’s m-calculus [20], which however is 
untyped, and does not provide programmer control over the order of evaluation. 

In a more recent calculus [1], Ancona et al. separate mixin modules from 
modules, and handle side effects as a monad. However, they do not attempt 
to statically reject faulty recursive definitions. Moreover, in their system, given
a composition $e_1 \gg e_2$, the monadic (i.e., side-effective) definitions of $e_1$ are
necessarily evaluated before those of $e_2$, which is less flexible than our proposal.



Language designs with mixin modules. Duggan and Sourelis [8] propose an ex- 
tension of ML with mixin modules, where mixin modules are divided into a 
prelude, a body, and an initialization section. Only definitions from the body are 
concerned by mixin module composition, the other sections being simply con- 
catenated (and disjoint). Also, the body is restricted to functions and data-type 
definitions, which prevents illegal recursive definitions from arising dynamically. 
This is less flexible than Mix, since it considerably limits the alternation of 
functional and computational definitions. 

Flatt and Felleisen [10] introduce the closely related notion of units, in the 
form of (1) a theoretical extension to Scheme and ML and (2) an actual ex- 
tension of their PLT Scheme implementation of Scheme [9]. In their theoretical 
work, they only permit values as unit components, except for a separate initial- 
ization section. This is more restrictive than Mix, in the same way as Duggan 
and Sourelis. In the implementation, however, the semantics is different. Any ex- 
pression is allowed as a definition, and instantiation works in two phases. First, 
all fields are initialized to nil; and second, they are evaluated and updated, one 
after another. This yields both unexpected behavior (consider the definition
x = cons(1, x)), and dynamic type errors (consider x = x + 1), which do not
occur in Mix. Finally, units do not feature late binding, contrary to Mix.
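
The two-phase instantiation described above can be imitated in Python to see both problems. This is a simulation under stated assumptions, not PLT Scheme code: None stands for nil, a pair stands for a cons cell, and each definition is a function that reads the other fields from a dictionary.

```python
def instantiate(definitions):
    # Phase 1: every field starts out as nil (None here).
    env = {name: None for name, _ in definitions}
    # Phase 2: fields are evaluated and updated one after another.
    for name, body in definitions:
        env[name] = body(env)
    return env


# x = cons(1, x): the tail is the placeholder, not a cyclic list.
print(instantiate([("x", lambda env: (1, env["x"]))]))

# x = x + 1: the placeholder reaches an arithmetic operation.
try:
    instantiate([("x", lambda env: env["x"] + 1)])
except TypeError as err:
    print("dynamic type error:", err)
```

In Mix, neither definition is accepted outside a recursive block, and inside a block both are rejected unless they belong to RecExp.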



Linking calculi. Other languages that are close to mixin modules are linking 
calculi [5, 18]. Generally, they support neither nested modules nor late bind- 
ing, which significantly departs from Mix. Furthermore, among them, Cardelli’s 
proposal [5] does not restrict recursion at all, but the operational semantics is 
sequential in nature and does not appear to handle cross-unit recursion. As a 
result, the system seems to lack the progress property. Finally, Machkasova and 
Turbak [18] explore a linking calculus with a very rich equational theory, but 
which does not restrict recursion and is untyped.

Flexible mixin modules. In the latest versions of MM [12], a solution to the
problem of dependency graphs in types is proposed. Instead of imposing that 
the graph in a mixin module type exactly reflect the dependencies of the con- 
sidered mixin module, it is seen as a bound on its dependencies, thanks to an 
adequate notion of subtyping. Roughly, it ensures that the considered mixin 
module has no more dependencies than exposed by the graph. This allows two 
techniques for preventing MM mixin module types from being verbose and over- 
specified. First, the interfaces of a mixin module e can be given more constrained 
dependency graphs than that of e. This makes interfaces more robust to later 
changes. Second, a certain class of dependency graphs is characterized that has
a convenient syntactic description, thus sparing users from writing graphs
by hand. In fact, this syntactic sugar allows MM types to be written exactly as Mix
types. We call the language obtained by restricting MM to such types 

(in a way that remains to be made precise, for instance by insertion of implicit 
coercions). 

The expressive power of typed MM'^ w.r.t. reordering lies between MM and 
Mix. Intuitively, the order in Mix mixins is fixed at definition time, while 
allows later reordering of input components. For instance, the Mix basic mixin 
$e_1 = \langle X \triangleright x = \bullet, Y \triangleright y = \bullet \rangle$ cannot be composed with
$e_2 = \langle Y \triangleright y = 0, X \triangleright x = 0 \rangle$, which is unfortunate. The equivalent composition is well-typed in MM^.

The importance of this loss of flexibility has to be further investigated. In- 
tuitively, it can only be annoying when a mixin is reused in an unexpected way, 
which makes the initial order incorrect. Unfortunately, the classical examples 
using mixin modules (and modules in general) generally show the modular de-
composition of programs, not really the reuse of existing code, with possibly 
badly ordered components. This comes from the lack of extensive practice of 
any system with mixin modules, which is one of our priorities for future work. 

6 Future Work 

Type components and subtyping. Before incorporating mixin modules into a
practical language, we have to refine our type system in at least two respects. 
First, we have to design an extended version of Mix including ML-style user- 
defined type components and data types. This task should benefit from recent 
advances in the design of recursive module systems [6, 7]. Second, we have to 
enrich our type system with a notion of subtyping over mixin modules. Indeed, 
it might turn out too restrictive for a module system to require, as Mix does, a 
definition filling a declaration of type $\tau$ to have exactly type $\tau$.

Compilation. The Mix language features anonymous definitions, and thus the 
compilation scheme for mixin modules proposed by Hirschowitz and Leroy [14] 
does not apply. A possible extension of this scheme to anonymous definitions is 
sketched in later work [12], but not formalized. This extension might apply to 
Mix. However, it should be possible to do better than this, by taking advantage 
of the more rigid structure of Mix mixin modules.

Other definitions of composition. The composition operator of Mix is somewhat
arbitrary. This suggests exploring other definitions, perhaps more flexible ones.
Ideally, the system should be parameterized over the notion of composition.

Abstract. We propose a semantic framework for modelling the linear
usage of continuations in typed call-by-name programming languages.
On the semantic side, we introduce a construction for categories of linear
continuations, which gives rise to cartesian closed categories with “linear
classical disjunctions” from models of intuitionistic linear logic with
sums. On the syntactic side, we give a simply typed call-by-name
$\lambda\mu$-calculus in which the use of names (continuation variables) is
restricted to be linear. Its semantic interpretation into a category of linear
continuations then amounts to the call-by-name continuation-passing style
(CPS) transformation into a linear lambda calculus with sum types. We
show that our calculus is sound for this CPS semantics, hence for models
given by the categories of linear continuations.



1 Introduction 

1.1 Linearly Used Continuations 

Recent work on linearly used continuations by Berdine, O’Hearn, Reddy and 
Thielecke [7,8] points out the advantage of looking at the linear usage of contin- 
uations in programming languages. They observe: 

... in the many forms of control, continuations are used linearly. This 
is true for a wide range of effects, including procedure call and return, 
exceptions, goto statements, and coroutines. 

They then propose linear type systems (based on a version of intuitionistic linear
logic [13,2,3]) for capturing the linear usage of continuations, where the linear
types are used for typing the target code of continuation-passing style (CPS)
transforms, rather than the source (ML or Scheme, for example) programs.
Several “good” examples are shown to typecheck, while examples which duplicate
continuations do not. An instance of such a situation is found in recent work
on axiomatizing delimited continuations [19], where the linear usage of
metacontinuations is crucial.

Motivated by Berdine et al.’s work, in a previous paper [14] we developed
a semantic framework for linearly used continuations (and more generally
linearly used effects) in typed call-by-value (CBV) programming languages in



Y. Kameyama and P.J. Stuckey (Eds.): FLOPS 2004, LNCS 2998, pp. 229-243, 2004. 
© Springer-Verlag Berlin Heidelberg 2004 



230 Masahito Hasegawa 



terms of models of linear type theories. In particular, the CBV CPS
transformation is naturally derived as an instance of a general monadic
transformation into the linear lambda calculus in this framework, and we have
shown that the CPS transformation enjoys good properties, most notably full
completeness (the “no-junk property”). Further results, including a fully
abstract game semantics, have been given by Laird [20]. Thus the semantic
analysis of linear CPS in the CBV setting has proved fruitful and successful to
some extent.

The present paper proposes an analogous approach for linearly used
continuations in the call-by-name setting. Thus we first seek a semantic
construction which gives a model capturing the linearity of the usage of
continuations from a model of linear type theory, and then extract the
call-by-name CPS transformation into the linear lambda calculus from the
construction. In this way we provide sound models of the call-by-name
$\lambda\mu$-calculus [23] in which the use of names (continuation variables) is
restricted to be linear. Proof theoretically, this restriction prevents us from
writing programs (proofs) of many of the “classical” types, because the
disjunction type is used only linearly. We still have the excluded middle
$\neg A \vee A$ (because it is isomorphic to $A \Rightarrow A$ in this world), but not
the double-negation elimination $\neg\neg A \Rightarrow A$ (equivalently
$\neg\neg\neg A \vee A$) in general. This means that the typing for linearly used
continuations is placed somewhere between the intuitionistic and classical ones [1].
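
The two equivalences mentioned above can be checked against the isomorphism $A \Rightarrow B \cong (A \Rightarrow \bot) \vee B$, which is established later as Proposition 3. The following is only a sketch, assuming that $\neg A$ abbreviates $A \Rightarrow \bot$:

```latex
\begin{align*}
\neg A \vee A
  &= (A \Rightarrow \bot) \vee A
   \;\cong\; A \Rightarrow A, \\
\neg\neg A \Rightarrow A
  &\cong (\neg\neg A \Rightarrow \bot) \vee A
   \;=\; \neg\neg\neg A \vee A .
\end{align*}
```

The first type is inhabited by the identity function, which uses no names at all; the second is the one that is not available in general.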



1.2 Semantic Construction: Categories of Linear Continuations 

The central semantic construction in this work, though rather simple and possi- 
bly folklore among specialists, is that of categories of linear continuations, which 
can be considered as a generalization of two well-known constructions of carte- 
sian closed categories: 

1. The semantic counterpart of the (call-by-name) double-negation translation
from classical logic to intuitionistic logic: we construct a cartesian closed
category from a cartesian closed category with sums as the opposite of the
Kleisli category of the “continuation monad” $((-) \Rightarrow R) \Rightarrow R$, also known
as the category of continuations [16,26].

2. The semantic counterpart of the Girard translation from intuitionistic logic
to linear logic: we construct a cartesian closed category from a model of linear
logic as the co-Kleisli category of the comonad
$!(-) = ?((-) \multimap \boldsymbol{\bot}) \multimap \boldsymbol{\bot}$
(where we assume the presence of products), or equivalently as the opposite
of the Kleisli category of the monad
$?(-) = !((-) \multimap \boldsymbol{\bot}) \multimap \boldsymbol{\bot}$ (where we
need sums).

The view of regarding the modalities ! and ? as expressing “linearly used
continuations” and “linearly defined continuations” has been emphasized in our
previous work [15] (and is also implicit in Filinski’s work [12]), and it helps us
to understand these two situations as instances of a single setting. Starting from
a model of linear logic with sums, we construct a cartesian closed category as the
opposite of the Kleisli category of the ?-like monad
$T(-) = !((-) \multimap R) \multimap R$.

One technically interesting point is that monads of this form are in general
not strong: they only have “strength with respect to !” [10]. Thus they seem
less useful in the call-by-value setting (because a monad needs to be strong for
interpreting a reasonable “notion of computation” in the sense of Moggi [22]).
This also implies that the induced operators on objects for the “linear classical
disjunction” do not form a premonoidal structure [24]: the object function
$A \vee (-)$ does not extend to a functor.

1.3 Organization of This Paper 

This article is organized as follows. In Sect. 2 we recall the semantics and syntax
of the linear lambda calculus which serves as the target of our CPS
transformation. Sect. 3 introduces the construction of categories of linear
continuations. In Sect. 4 we consider the $\lambda\mu$-calculus with linear controls,
and spell out the CPS transformation derived from the semantic construction of
the preceding section. Sect. 5 concludes the paper. Appendices summarize the
linear lambda calculus DILL and the notion of !-strong monads.

2 Preliminaries 

2.1 Categorical Models of Linear Logic 

We describe models of linear logic in terms of symmetric monoidal closed
categories with additional structure: a suitable comonad for modelling the
modality “of course” !, finite products/coproducts for modelling additives, and a
dualising object (hence *-autonomous structure [4,5,25]) for modelling the duality
of classical linear logic. For reference, we give a compact description of the
comonads to be used below, due to Hyland and Schalk (which is equivalent to
Bierman’s detailed definition [9], and also to the formulation based on symmetric
monoidal adjunctions [6,3]).

Definition 1 (linear exponential comonad [17]). A symmetric monoidal
comonad $! = (!, \varepsilon, \delta, m_{A,B}, m_I)$ on a symmetric monoidal category
$\mathcal{C}$ is called a linear exponential comonad when the category of its
coalgebras is a category of commutative comonoids, that is:

- there are specified monoidal natural transformations $e_A : !A \to I$ and
  $d_A : !A \to !A \otimes !A$ which form a commutative comonoid $(!A, e_A, d_A)$ in
  $\mathcal{C}$ and also are coalgebra morphisms from $(!A, \delta_A)$ to $(I, m_I)$ and
  $(!A \otimes !A,\ m_{!A,!A} \circ (\delta_A \otimes \delta_A))$ respectively, and

- any coalgebra morphism from $(!A, \delta_A)$ to $(!B, \delta_B)$ is also a comonoid
  morphism from $(!A, e_A, d_A)$ to $(!B, e_B, d_B)$.

2.2 Dual Intuitionistic Linear Logic 

The target calculus we will make use of is the multiplicative exponential fragment
of intuitionistic linear logic, formulated as a linear lambda calculus summarized
in Appendix A. Our presentation is based on a dual-context type system for
intuitionistic linear logic (called DILL) due to Barber and Plotkin [2,3]. In this
formulation of the linear lambda calculus, a typing judgement takes the form
$\Gamma ; \Delta \vdash M : \tau$, in which $\Gamma$ represents an intuitionistic (or
additive) context whereas $\Delta$ is a linear (multiplicative) context. It has been
shown that symmetric monoidal closed categories with a linear exponential comonad
provide a sound and complete class of categorical models of DILL [3].

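In the sequel, it turns out to be convenient to introduce syntactic sugar for the
“intuitionistic” or “non-linear” function type

\[
\begin{array}{rcl}
\tau_1 \to \tau_2 & \equiv & !\tau_1 \multimap \tau_2 \\
\lambda\!\!\lambda x^{\tau}.M & \equiv & \lambda y^{!\tau}.\ \mathsf{let}\ !x^{\tau}\ \mathsf{be}\ y\ \mathsf{in}\ M \\
M \mathbin{@} N & \equiv & M\,(!N)
\end{array}
\]

which enjoy the following typing derivations.

\[
\frac{\Gamma, x : \tau_1 \,;\, \Delta \vdash M : \tau_2}
     {\Gamma \,;\, \Delta \vdash \lambda\!\!\lambda x^{\tau_1}.M : \tau_1 \to \tau_2}
\qquad
\frac{\Gamma \,;\, \Delta \vdash M : \tau_1 \to \tau_2 \quad \Gamma \,;\, \emptyset \vdash N : \tau_1}
     {\Gamma \,;\, \Delta \vdash M \mathbin{@} N : \tau_2}
\]

As one expects, the usual $\beta\eta$-equalities
$(\lambda\!\!\lambda x.M) \mathbin{@} N = M[N/x]$ and
$\lambda\!\!\lambda x.M \mathbin{@} x = M$ (with $x$ not free in $M$) are easily
provable from the axioms of DILL. (In fact it is possible to have them as
primitives rather than derived constructs [7,8,15].)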
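
As a sketch of why the $\beta$-equality for the non-linear function type is derivable, one can unfold the sugar and apply two DILL axioms from Appendix A. The notation $\lambda\!\!\lambda$ and $@$ for non-linear abstraction and application is assumed here.

```latex
\begin{align*}
(\lambda\!\!\lambda x^{\sigma}.M) \mathbin{@} N
  &\equiv (\lambda y^{!\sigma}.\ \mathsf{let}\ !x^{\sigma}\ \mathsf{be}\ y\ \mathsf{in}\ M)\,(!N)
     && \text{unfolding the sugar} \\
  &= \mathsf{let}\ !x^{\sigma}\ \mathsf{be}\ !N\ \mathsf{in}\ M
     && \text{by } (\lambda x.M)\,N = M[N/x] \\
  &= M[N/x]
     && \text{by } \mathsf{let}\ !x\ \mathsf{be}\ !M\ \mathsf{in}\ N = N[M/x]
\end{align*}
```

The $\eta$-equality follows along similar lines, using the axiom $\mathsf{let}\ !x\ \mathsf{be}\ M\ \mathsf{in}\ !x = M$ together with the commuting conversions for linear contexts.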

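In addition to the constructs of DILL, we need to deal with (additive) sum
types. Here we employ the fairly standard syntax:

\[
\frac{\Gamma ; \Delta \vdash M : 0}
     {\Gamma ; \Delta \vdash \mathsf{abort}_{\sigma}\, M : \sigma}\ (0\,\mathrm{E})
\]

\[
\frac{\Gamma ; \Delta \vdash M : \sigma}
     {\Gamma ; \Delta \vdash \mathsf{inl}_{\sigma,\tau}\, M : \sigma \oplus \tau}\ (\oplus \mathrm{I}_L)
\qquad
\frac{\Gamma ; \Delta \vdash N : \tau}
     {\Gamma ; \Delta \vdash \mathsf{inr}_{\sigma,\tau}\, N : \sigma \oplus \tau}\ (\oplus \mathrm{I}_R)
\]

\[
\frac{\Gamma ; \Delta_1 \vdash L : \sigma \oplus \tau \quad
      \Gamma ; \Delta_2, x : \sigma \vdash M : \theta \quad
      \Gamma ; \Delta_2, y : \tau \vdash N : \theta}
     {\Gamma ; \Delta_1 \sharp \Delta_2 \vdash
      \mathsf{case}\ L\ \mathsf{of}\ \mathsf{inl}\ x^{\sigma} \mapsto M \parallel \mathsf{inr}\ y^{\tau} \mapsto N : \theta}\ (\oplus \mathrm{E})
\]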

3 Categories of Linear Continuations 

Let $\mathcal{C}$ be a symmetric monoidal closed category with a linear exponential
comonad ! and finite coproducts (we write 0 for an initial object and $\oplus$ for
binary coproducts). Fix an object $R$, and define a category $R^{\mathcal{C}}$ as
follows: the objects of $R^{\mathcal{C}}$ are the same as those of $\mathcal{C}$, and
arrows are given by

\[ R^{\mathcal{C}}(A,B) = \mathcal{C}(!(A \multimap R),\ B \multimap R). \]

The identities and compositions in $R^{\mathcal{C}}$ are inherited from the co-Kleisli
category $\mathcal{C}_!$ of the comonad ! (so, up to equivalence, $R^{\mathcal{C}}$ can be
considered as the full subcategory of $\mathcal{C}_!$ whose objects are of the form
$A \multimap R$).

As is easily seen, $R^{\mathcal{C}}(A,B) \cong \mathcal{C}(B,\ !(A \multimap R) \multimap R)$;
thus $R^{\mathcal{C}}$ is isomorphic to the opposite of the Kleisli category of the monad
$TX = !(X \multimap R) \multimap R$ on $\mathcal{C}$. Note that this monad is not
necessarily strong, but it is strong with respect to ! (i.e., it has a restricted
form of strength $!A \otimes TX \to T(!A \otimes X)$); see Appendix B for the
definition of !-strong monads. This notion was introduced by Blute et al.
[10] for axiomatising the exponential “why not” ?. A !-strong monad may not
be strong, though it induces a strong monad on the co-Kleisli category
$\mathcal{C}_!$ [14].

Proposition 1. The monad $TX = !(X \multimap R) \multimap R$ on $\mathcal{C}$ is !-strong.

In terms of DILL, the !-strength is represented as follows.

\[
a : A \,;\, m : (X \multimap R) \to R \ \vdash\
\lambda\!\!\lambda k^{(!A \otimes X) \multimap R}.\ m \mathbin{@} (\lambda x^{X}.\ k\,(!a \otimes x))
\ :\ ((!A \otimes X) \multimap R) \to R
\]

Note the non-linear use of the variable $a : A$.

Note that the exponential
$?(-) = !((-) \multimap \boldsymbol{\bot}) \multimap \boldsymbol{\bot}$ is a particular
instance of this; see Example 2 below. Another typical example of !-strong (but not
necessarily strong) monads is the exception monad $X \mapsto X \oplus E$.

3.1 Cartesian Closure

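A category of linear continuations has sufficient structure for modelling the
simply typed lambda calculus [21]:

Proposition 2. $R^{\mathcal{C}}$ is a cartesian closed category.

Proof: Define

\[ \top = 0 \qquad A \wedge B = A \oplus B \qquad A \Rightarrow B = !(A \multimap R) \otimes B \]

We shall see that $\top$ is a terminal object, $A \wedge B$ is a binary product of $A$
and $B$, and $A \Rightarrow B$ is an exponential of $B$ by $A$ in $R^{\mathcal{C}}$.

\begin{align*}
R^{\mathcal{C}}(A, \top)
  &= \mathcal{C}(!(A \multimap R),\ 0 \multimap R) \\
  &\cong \mathcal{C}(0,\ !(A \multimap R) \multimap R) \\
  &\cong 1 \\[1ex]
R^{\mathcal{C}}(A, B \wedge C)
  &= \mathcal{C}(!(A \multimap R),\ (B \oplus C) \multimap R) \\
  &\cong \mathcal{C}(B \oplus C,\ !(A \multimap R) \multimap R) \\
  &\cong \mathcal{C}(B,\ !(A \multimap R) \multimap R) \times \mathcal{C}(C,\ !(A \multimap R) \multimap R) \\
  &\cong \mathcal{C}(!(A \multimap R),\ B \multimap R) \times \mathcal{C}(!(A \multimap R),\ C \multimap R) \\
  &= R^{\mathcal{C}}(A,B) \times R^{\mathcal{C}}(A,C) \\[1ex]
R^{\mathcal{C}}(A \wedge B, C)
  &= \mathcal{C}(!((A \oplus B) \multimap R),\ C \multimap R) \\
  &\cong \mathcal{C}(!(A \multimap R) \otimes !(B \multimap R),\ C \multimap R) \\
  &\cong \mathcal{C}(!(A \multimap R),\ (!(B \multimap R) \otimes C) \multimap R) \\
  &= R^{\mathcal{C}}(A, B \Rightarrow C)
\end{align*}

Here we use an isomorphism
$!((A \oplus B) \multimap R) \cong !(A \multimap R) \otimes !(B \multimap R)$, which can
be thought of as an instance of the “Seely isomorphism”
$!(X \mathbin{\&} Y) \cong !X \otimes !Y$ [25,9], as we have a product diagram

\[ A \multimap R \ \xleftarrow{\ \mathsf{inl} \multimap R\ }\ (A \oplus B) \multimap R \ \xrightarrow{\ \mathsf{inr} \multimap R\ }\ B \multimap R . \]

Since a linear exponential comonad ! arises from a symmetric monoidal adjunction
from a category with finite products [9,6,3], it follows that ! sends a product of
$X$ and $Y$ (if it exists) to $!X \otimes !Y$ up to coherent isomorphism. □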

3.2 Disjunctions 

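It is natural to define the (linear) disjunctions on $R^{\mathcal{C}}$ as

\[ \bot = I \qquad A \vee B = A \otimes B \]

which satisfy the isomorphisms one would expect for “classical” disjunctions:

Proposition 3. The following isomorphisms exist:

\[
\begin{array}{rcl}
A \Rightarrow B & \cong & (A \Rightarrow \bot) \vee B \\
A \Rightarrow (B \vee C) & \cong & (A \Rightarrow B) \vee C
\end{array}
\]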
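
As a sketch of why Proposition 3 holds, one can unfold the definitions $A \Rightarrow B = !(A \multimap R) \otimes B$, $\bot = I$ and $A \vee B = A \otimes B$ and use the unit and associativity isomorphisms of $\otimes$ in the base category. This assumes, as seems intended, that isomorphisms of objects in the base category carry over to the category of linear continuations.

```latex
\begin{align*}
(A \Rightarrow \bot) \vee B
  &= (!(A \multimap R) \otimes I) \otimes B
   \;\cong\; !(A \multimap R) \otimes B
   \;=\; A \Rightarrow B, \\
A \Rightarrow (B \vee C)
  &= !(A \multimap R) \otimes (B \otimes C)
   \;\cong\; (!(A \multimap R) \otimes B) \otimes C
   \;=\; (A \Rightarrow B) \vee C .
\end{align*}
```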

However, they are not even premonoidal (because the monad $T$ is not strong).
The functor $A \otimes (-)$ on $\mathcal{C}$ does not give rise to a functor on
$R^{\mathcal{C}}$.

We also note that these linear disjunctions do not give weak coproducts in
general. For instance $\bot$ is not a weak initial object:

\[ R^{\mathcal{C}}(\bot, X) = \mathcal{C}(!(I \multimap R),\ X \multimap R) \cong \mathcal{C}(X,\ !R \multimap R) = \mathcal{C}(X,\ R \to R) \]

Hence we can define the canonical map from $\bot$ only to objects of the form $!X$.

3.3 Examples 

As mentioned in the introduction, the categories of linear continuations subsume
two well-known constructions of cartesian closed categories: one for the
call-by-name double-negation translation from classical logic to intuitionistic
logic, and the other for (the dual of) the Girard translation from intuitionistic
logic to linear logic. For the former, it suffices to simply trivialize the
linearity. For the latter, we let the response object $R$ be the dualising object
(linear falsity type) $\boldsymbol{\bot}$.*

Example 1 (Categories of continuations). Let $\mathcal{C}$ be a cartesian closed
category with finite coproducts and an object $R$. By taking the identity comonad as
the linear exponential comonad, we have sufficient structure for constructing
a category $R^{\mathcal{C}}$ of (linear) continuations. Its objects are the same as
those of $\mathcal{C}$, with $R^{\mathcal{C}}(A,B) = \mathcal{C}(R^A, R^B)$, together with
the terminal object 0, binary product $A \oplus B$ and exponential $R^A \times B$.
This is exactly the category of continuations [16,26]. Note that, in this case,
the monad $T$ is the standard continuation monad and is strong, hence the
classical disjunction is premonoidal.
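
To make Example 1 concrete, the following Python sketch models a morphism from A to B in the (non-linear) category of continuations as a function from A-continuations to B-continuations. The encoding of coproducts as tagged tuples and the helper names `pair`, `proj1` and `proj2` are illustrative assumptions, not part of the original development.

```python
# A morphism A -> B is modelled as a function (A -> R) -> (B -> R).
# The binary product of B and C is the coproduct B + C of the base
# category, encoded here as tagged tuples ("inl", b) / ("inr", c).


def pair(f, g):
    """Combine f : A -> B and g : A -> C into a morphism A -> B /\\ C."""
    def h(ka):
        def on_sum(tagged):
            tag, value = tagged
            if tag == "inl":
                return f(ka)(value)
            if tag == "inr":
                return g(ka)(value)
            raise ValueError(f"unknown tag: {tag!r}")
        return on_sum
    return h


def proj1(h):
    """Recover the first component of a morphism into a product."""
    return lambda ka: lambda b: h(ka)(("inl", b))


def proj2(h):
    """Recover the second component of a morphism into a product."""
    return lambda ka: lambda c: h(ka)(("inr", c))


# Hypothetical instance with R = int and A = B = C = int.
f = lambda ka: lambda b: ka(b + 1)
g = lambda ka: lambda c: ka(c * 2)
h = pair(f, g)
k = lambda a: a * 10  # an A-continuation
```

Projecting `h` gives back functions that agree with `f` and `g` on every continuation and argument, which is the universal property of the product read through the isomorphism between continuations on B + C and pairs of continuations.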

* This should not be confused with the classical falsity $\bot$ introduced in the
last section. In this paper we use $\bot$ and $\boldsymbol{\bot}$ for the classical
falsity and linear falsity (dualising object) respectively.

Example 2 (Girard translation). Suppose that $\mathcal{C}$ is *-autonomous [4], thus
has a dualising object $\boldsymbol{\bot}$ and can model classical linear logic [25,5].
We note that its opposite $\mathcal{C}^{op}$ is also *-autonomous, with a linear
exponential comonad $?X = !(X \multimap \boldsymbol{\bot}) \multimap \boldsymbol{\bot}$.
By letting $R$ be $\boldsymbol{\bot}$, we have

\[
\boldsymbol{\bot}^{\mathcal{C}^{op}}(A,B)
  \cong \mathcal{C}^{op}(?(A \multimap \boldsymbol{\bot}),\ B \multimap \boldsymbol{\bot})
  \cong \mathcal{C}^{op}(B,\ !A)
  = \mathcal{C}(!A,\ B)
  = \mathcal{C}_!(A,B).
\]

Thus the derivation of the cartesian closure of $\boldsymbol{\bot}^{\mathcal{C}^{op}}$ is
exactly the well-known “decomposition” $A \Rightarrow B = !A \multimap B$ (given a
symmetric monoidal closed category $\mathcal{C}$ with a linear exponential comonad !
and finite products, the co-Kleisli category $\mathcal{C}_!$ is cartesian closed).
Sec. 4.5 gives a syntactic interpretation of this observation.

Of course, there are many categories of linear continuations which do not fall 
into these two extreme cases. For instance: 

Example 3. Let $\mathcal{C}$ be the category of $\omega$-cpo’s (with bottom) and strict
continuous functions, ! be the lifting, and $R$ be any object. The category
$R^{\mathcal{C}}$ has the same objects as $\mathcal{C}$, but a morphism from $A$ to $B$ is
a (possibly non-strict) continuous function between the strict-function spaces
$A \multimap R$ and $B \multimap R$.

3.4 Discussion: Towards Direct-Style Models 

Ideally, we would like to find a direct axiomatization of the categories of linear
continuations as cartesian closed categories with extra structure, as neatly
demonstrated in the non-linear case by Selinger with control categories and their
structure theorem with respect to categories of continuations [26]. The main
difficulty in our linear case is that the linear classical disjunction is no longer
premonoidal, and we do not know how to axiomatize it. So there seems to be
no obvious way to adapt Selinger’s work to define a notion of “linear control
categories”.

But there is still some hope: we can consider the category
$R^{\mathcal{C}}_{\multimap}$ defined by
$R^{\mathcal{C}}_{\multimap}(A,B) = \mathcal{C}(A \multimap R,\ B \multimap R)$, which can
be regarded as a lluf subcategory of linear maps in $R^{\mathcal{C}}$ (provided the
counit is epi). The disjunctions do form a symmetric premonoidal structure on
$R^{\mathcal{C}}_{\multimap}$. Moreover, the category $R^{\mathcal{C}}_{\multimap}$ has
finite products and the inclusion from $R^{\mathcal{C}}_{\multimap}$ to $R^{\mathcal{C}}$
preserves them.

So it might be natural to formulate the structure directly as a cartesian
closed category together with a lluf subcategory with finite products (preserved
by the inclusion) and distributive symmetric premonoidal products (but not
with codiagonals as required for control categories), satisfying certain coherence
conditions, e.g. on the isomorphism
$A \Rightarrow (B \vee C) \cong (A \Rightarrow B) \vee C$.

4 The $\lambda\mu$-Calculus with Linear Controls

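We formulate the calculus for expressing “linearly used continuations” as a
constrained $\lambda\mu$-calculus where names (continuation variables) are used and
bound just linearly. Here we make use of the syntax for the simply typed
$\lambda\mu$-calculus with disjunctions [26], together with a typing system which
represents this linearity constraint on names.

4.1 The Calculus

Types and Terms

\[
\begin{array}{rcl}
\sigma & ::= & b \mid \sigma \Rightarrow \sigma \mid \top \mid \sigma \wedge \sigma \mid \bot \mid \sigma \vee \sigma \\
M & ::= & x \mid \lambda x^{\sigma}.M \mid M\,M \mid * \mid \langle M, M \rangle \mid \pi_1 M \mid \pi_2 M \mid \\
  &     & [\alpha]M \mid \mu\alpha^{\sigma}.M \mid [\alpha,\beta]M \mid \mu(\alpha^{\sigma},\beta^{\tau}).M
\end{array}
\]

where $b$ ranges over base types.

Typing

\[
\frac{\Gamma, x : \sigma \vdash M : \tau \mid \Delta}
     {\Gamma \vdash \lambda x^{\sigma}.M : \sigma \Rightarrow \tau \mid \Delta}
\qquad
\frac{\Gamma \vdash M : \sigma \Rightarrow \tau \mid \Delta \quad \Gamma \vdash N : \sigma \mid \emptyset}
     {\Gamma \vdash M\,N : \tau \mid \Delta}
\]

\[
\frac{}{\Gamma \vdash * : \top \mid \emptyset}
\qquad
\frac{\Gamma \vdash M : \sigma \mid \Delta_1 \quad \Gamma \vdash N : \tau \mid \Delta_2}
     {\Gamma \vdash \langle M, N \rangle : \sigma \wedge \tau \mid \Delta_1 \sharp \Delta_2}
\]

\[
\frac{\Gamma \vdash M : \sigma \wedge \tau \mid \emptyset}
     {\Gamma \vdash \pi_1 M : \sigma \mid \emptyset}
\qquad
\frac{\Gamma \vdash M : \sigma \wedge \tau \mid \emptyset}
     {\Gamma \vdash \pi_2 M : \tau \mid \emptyset}
\]

\[
\frac{\Gamma \vdash M : \sigma \mid \Delta}
     {\Gamma \vdash [\alpha]M : \bot \mid \{\alpha : \sigma\} \sharp \Delta}
\qquad
\frac{\Gamma \vdash M : \bot \mid \alpha : \sigma, \Delta}
     {\Gamma \vdash \mu\alpha^{\sigma}.M : \sigma \mid \Delta}
\]

\[
\frac{\Gamma \vdash M : \sigma \vee \tau \mid \Delta}
     {\Gamma \vdash [\alpha,\beta]M : \bot \mid \{\alpha : \sigma, \beta : \tau\} \sharp \Delta}
\qquad
\frac{\Gamma \vdash M : \bot \mid \alpha : \sigma, \beta : \tau, \Delta}
     {\Gamma \vdash \mu(\alpha^{\sigma},\beta^{\tau}).M : \sigma \vee \tau \mid \Delta}
\]

where $\Delta_1 \sharp \Delta_2$ represents one of the possible merges of $\Delta_1$ and
$\Delta_2$ as finite lists. We assume that, when we introduce
$\Delta_1 \sharp \Delta_2$, there is no name occurring both in $\Delta_1$ and in
$\Delta_2$. We write $\emptyset$ for the empty context. Note that names cannot be free
in the argument of a function application.

Example 4. The reader is invited to verify that

\[ \mu(\kappa^{\sigma \Rightarrow \bot}, \alpha^{\sigma}).[\kappa](\lambda x^{\sigma}.[\alpha]x) : (\sigma \Rightarrow \bot) \vee \sigma \]

typechecks, while

\[ \lambda m^{(\sigma \Rightarrow \bot) \Rightarrow \bot}.\mu\alpha^{\sigma}.m\,(\lambda x^{\sigma}.[\alpha]x) : ((\sigma \Rightarrow \bot) \Rightarrow \bot) \Rightarrow \sigma \]

does not: the name $\alpha$ occurs in the argument of a function $m$ which may
duplicate or discard $\alpha$.

Example 5. The disjunction $(-) \vee \tau$ fails to be functorial: given a term
$M : \sigma \Rightarrow \sigma'$, one might expect to have

\[ M \vee \tau = \lambda z^{\sigma \vee \tau}.\mu(\alpha'^{\sigma'}, \beta^{\tau}).[\alpha'](M\,(\mu\alpha^{\sigma}.[\alpha,\beta]z)) : \sigma \vee \tau \Rightarrow \sigma' \vee \tau \]

which is illegal because $M$ may not use $\beta$ linearly.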
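
The linearity discipline on names in Sect. 4.1 can be approximated by a small untyped check. The Python sketch below is not the typing system itself: it ignores types and ordinary variables, assumes that variables and * carry no names, and uses a hypothetical tuple encoding of terms (`"name"` for $[\alpha]M$, `"name2"` for $[\alpha,\beta]M$, `"mu"` and `"mu2"` for the two binders). It only checks that every name is used exactly once and never inside a function argument or under a projection.

```python
class LinearityError(Exception):
    """Raised when a name is duplicated, discarded, or used in a forbidden position."""


def _disjoint_merge(left, right):
    """Merge two lists of names, rejecting any name that occurs in both."""
    clash = set(left) & set(right)
    if clash:
        raise LinearityError(f"name(s) used twice: {sorted(clash)}")
    return left + right


def _bind(names, bound):
    """Remove the bound names, requiring each of them to occur."""
    for b in bound:
        if b not in names:
            raise LinearityError(f"bound name {b!r} is never used")
    return [n for n in names if n not in bound]


def free_names(term):
    """Return the free names of a term, checking linear usage on the way."""
    tag = term[0]
    if tag in ("var", "unit"):        # ("var", x) / ("unit",)
        return []
    if tag == "lam":                  # ("lam", x, body)
        return free_names(term[2])
    if tag == "app":                  # ("app", fun, arg)
        if free_names(term[2]):
            raise LinearityError("a name occurs in a function argument")
        return free_names(term[1])
    if tag == "pair":                 # ("pair", left, right)
        return _disjoint_merge(free_names(term[1]), free_names(term[2]))
    if tag == "proj":                 # ("proj", i, body)
        if free_names(term[2]):
            raise LinearityError("a name occurs under a projection")
        return []
    if tag == "name":                 # ("name", a, body) encodes [a]M
        return _disjoint_merge([term[1]], free_names(term[2]))
    if tag == "name2":                # ("name2", a, b, body) encodes [a,b]M
        both = _disjoint_merge([term[1]], [term[2]])
        return _disjoint_merge(both, free_names(term[3]))
    if tag == "mu":                   # ("mu", a, body)
        return _bind(free_names(term[2]), [term[1]])
    if tag == "mu2":                  # ("mu2", a, b, body)
        if term[1] == term[2]:
            raise LinearityError("the two bound names must be distinct")
        return _bind(free_names(term[3]), [term[1], term[2]])
    raise ValueError(f"unknown term constructor: {tag!r}")


# The two terms of Example 4 and the term of Example 5 in this encoding.
inner = ("lam", "x", ("name", "a", ("var", "x")))
good = ("mu2", "k", "a", ("name", "k", inner))
bad = ("lam", "m", ("mu", "a", ("app", ("var", "m"), inner)))
functor = ("lam", "z", ("mu2", "a1", "b", ("name", "a1",
           ("app", ("var", "M"),
            ("mu", "a", ("name2", "a", "b", ("var", "z")))))))
```

On these inputs the check accepts the first term of Example 4 and rejects the other two because a name occurs free in a function argument, mirroring the explanations given in Examples 4 and 5.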

4.2 The Call-by-Name CPS Interpretation

The cartesian closed structure of the category of linear continuations motivates
the following interpretation of the arrow type

\[ [\![\sigma \Rightarrow \tau]\!] \cong\ !([\![\sigma]\!] \multimap R) \otimes [\![\tau]\!] \]

which leads us to interpret a typing judgement as follows.

\[
[\![x_1 : \sigma_1, \ldots, x_m : \sigma_m \vdash M : \sigma \mid \alpha_1 : \tau_1, \ldots, \alpha_n : \tau_n]\!] :
\]
\[
!([\![\sigma_1]\!] \multimap R) \otimes \cdots \otimes\ !([\![\sigma_m]\!] \multimap R) \otimes [\![\tau_1]\!] \otimes \cdots \otimes [\![\tau_n]\!]
\longrightarrow [\![\sigma]\!] \multimap R
\]

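Rather than describing the translation in terms of categorical combinators, below
we shall give it as a CPS transform into DILL with sums. For types we have

\[
\begin{array}{rclcrcl}
b^\circ & = & b & \qquad & (\sigma \Rightarrow \tau)^\circ & = & !(\sigma^\circ \multimap R) \otimes \tau^\circ \\
\top^\circ & = & 0 & & (\sigma \wedge \tau)^\circ & = & \sigma^\circ \oplus \tau^\circ \\
\bot^\circ & = & I & & (\sigma \vee \tau)^\circ & = & \sigma^\circ \otimes \tau^\circ
\end{array}
\]

and for terms

\[
\begin{array}{rcl}
x^\circ & = & x \\
(\lambda x^{\sigma}.M^{\tau})^\circ & = & \lambda(!x^{\sigma^\circ \multimap R} \otimes k^{\tau^\circ}).M^\circ\,k \\
(M^{\sigma \Rightarrow \tau}\,N^{\sigma})^\circ & = & \lambda k^{\tau^\circ}.M^\circ\,(!N^\circ \otimes k) \\
*^\circ & = & \lambda k^{0}.\mathsf{abort}_R\,k \\
\langle M^{\sigma}, N^{\tau} \rangle^\circ & = & \lambda k^{\sigma^\circ \oplus \tau^\circ}.\mathsf{case}\ k\ \mathsf{of}\ \mathsf{inl}(x) \mapsto M^\circ\,x \parallel \mathsf{inr}(y) \mapsto N^\circ\,y \\
(\pi_1 M^{\sigma \wedge \tau})^\circ & = & \lambda k^{\sigma^\circ}.M^\circ\,(\mathsf{inl}\,k) \\
(\pi_2 M^{\sigma \wedge \tau})^\circ & = & \lambda k^{\tau^\circ}.M^\circ\,(\mathsf{inr}\,k) \\
([\alpha]M)^\circ & = & \lambda *.M^\circ\,\alpha \\
(\mu\alpha^{\sigma}.M)^\circ & = & \lambda\alpha^{\sigma^\circ}.M^\circ\,* \\
([\alpha,\beta]M)^\circ & = & \lambda *.M^\circ\,(\alpha \otimes \beta) \\
(\mu(\alpha^{\sigma},\beta^{\tau}).M)^\circ & = & \lambda(\alpha^{\sigma^\circ} \otimes \beta^{\tau^\circ}).M^\circ\,*
\end{array}
\]

Also we use some “pattern matching binding”, e.g. $\lambda(x^{\sigma} \otimes y^{\tau}).M$
as a shorthand for
$\lambda z^{\sigma \otimes \tau}.\mathsf{let}\ x^{\sigma} \otimes y^{\tau}\ \mathsf{be}\ z\ \mathsf{in}\ M$,
and $\lambda *.M$ for $\lambda z^{I}.\mathsf{let}\ *\ \mathsf{be}\ z\ \mathsf{in}\ M$. A typing
judgement

\[ x_1 : \sigma_1, \ldots, x_m : \sigma_m \vdash M : \sigma \mid \alpha_1 : \tau_1, \ldots, \alpha_n : \tau_n \]

is sent to

\[ x_1 : \sigma_1^\circ \multimap R, \ldots, x_m : \sigma_m^\circ \multimap R \ ;\ \alpha_1 : \tau_1^\circ, \ldots, \alpha_n : \tau_n^\circ \ \vdash\ M^\circ : \sigma^\circ \multimap R. \]

Note that, if we ignore all the linearity information, this is precisely the same as
the call-by-name CPS transformation of Selinger [26].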
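
The type part of the transform is a straightforward structural recursion. The following Python sketch shows one way to write it down; the class names and the ASCII rendering of the target types (`-o` for linear implication, `*` for tensor, `+` for sum, `0` and `I` for the initial object and the tensor unit) are assumptions made for this illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Base:
    name: str


@dataclass(frozen=True)
class Top:
    pass


@dataclass(frozen=True)
class Bot:
    pass


@dataclass(frozen=True)
class Arrow:
    dom: object
    cod: object


@dataclass(frozen=True)
class And:
    left: object
    right: object


@dataclass(frozen=True)
class Or:
    left: object
    right: object


def cps_type(t):
    """Render the CPS image of a source type as an ASCII string."""
    if isinstance(t, Base):
        return t.name
    if isinstance(t, Top):
        return "0"                      # the terminal type goes to the initial object
    if isinstance(t, Bot):
        return "I"                      # classical falsity goes to the tensor unit
    if isinstance(t, Arrow):            # s => t  goes to  !(s' -o R) * t'
        return f"(!({cps_type(t.dom)} -o R) * {cps_type(t.cod)})"
    if isinstance(t, And):              # products go to sums
        return f"({cps_type(t.left)} + {cps_type(t.right)})"
    if isinstance(t, Or):               # disjunctions go to tensors
        return f"({cps_type(t.left)} * {cps_type(t.right)})"
    raise TypeError(f"not a source type: {t!r}")


# The excluded-middle type (s => bot) \/ s and the type s => s.
excluded_middle = Or(Arrow(Base("s"), Bot()), Base("s"))
identity = Arrow(Base("s"), Base("s"))
```

A term of source type $\sigma$ is then sent to a term of type $\sigma^\circ \multimap R$, so the strings produced here describe the argument type of the translated term.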

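Example 6. The well-typed term

\[ \mu(\kappa^{\sigma \Rightarrow \bot}, \alpha^{\sigma}).[\kappa](\lambda x^{\sigma}.[\alpha]x) : (\sigma \Rightarrow \bot) \vee \sigma \]

is sent to

\[ \lambda((!x^{\sigma^\circ \multimap R} \otimes *) \otimes \alpha^{\sigma^\circ}).x\,\alpha : ((!(\sigma^\circ \multimap R) \otimes I) \otimes \sigma^\circ) \multimap R \]

which essentially agrees with the transform of the identity function
$(\lambda x^{\sigma}.x)^\circ = \lambda(!x^{\sigma^\circ \multimap R} \otimes k^{\sigma^\circ}).x\,k : (!(\sigma^\circ \multimap R) \otimes \sigma^\circ) \multimap R$.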
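
The image of the term in Example 6 can be computed step by step from the clauses of the transform. The derivation below is a sketch: it names the first bound name $\kappa$ and reads nested patterns such as $\lambda(!x \otimes *).N$ as the evident iterated shorthand.

```latex
\begin{align*}
([\alpha]x)^\circ
  &= \lambda *.\,x\,\alpha \\
(\lambda x^{\sigma}.[\alpha]x)^\circ
  &= \lambda(!x \otimes k).\,(\lambda *.\,x\,\alpha)\,k
   \;=\; \lambda(!x \otimes *).\,x\,\alpha \\
([\kappa](\lambda x^{\sigma}.[\alpha]x))^\circ
  &= \lambda *.\,(\lambda(!x \otimes *).\,x\,\alpha)\,\kappa \\
(\mu(\kappa,\alpha).[\kappa](\lambda x^{\sigma}.[\alpha]x))^\circ
  &= \lambda(\kappa \otimes \alpha).\,(\lambda *.\,(\lambda(!x \otimes *).\,x\,\alpha)\,\kappa)\,* \\
  &= \lambda(\kappa \otimes \alpha).\,(\lambda(!x \otimes *).\,x\,\alpha)\,\kappa
     && \text{by } \mathsf{let}\ *\ \mathsf{be}\ *\ \mathsf{in}\ M = M \\
  &= \lambda((!x \otimes *) \otimes \alpha).\,x\,\alpha
     && \text{by } \beta \text{ and the pattern shorthand}
\end{align*}
```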

4.3 Axioms and Soundness

It is routine to see that this CPS transform validates the $\beta\eta$-equalities of
the simply typed lambda calculus with products (this is a consequence of the
cartesian closedness of the categories of linear continuations). In fact all axioms
for the call-by-name $\lambda\mu$-calculus (as given by Selinger [26]) are valid,
except the non-linear axiom $(\beta_\bot)$: $[\alpha^{\bot}]M = M$, which is replaced by
its linear variant $\mu\alpha^{\bot}.C[[\alpha]M] = C[M]$ below.



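Axioms

\[
\begin{array}{lrcl}
(\beta_\Rightarrow) & (\lambda x.M)\,N & = & M[N/x] \\
(\eta_\Rightarrow) & \lambda x.M\,x & = & M \quad (x \notin FV(M)) \\
(\beta_\wedge) & \pi_i \langle M_1, M_2 \rangle & = & M_i \\
(\eta_\wedge) & \langle \pi_1 M, \pi_2 M \rangle & = & M \\
(\eta_\top) & * & = & M \\
(\beta_\mu) & [\alpha'](\mu\alpha.M) & = & M[\alpha'/\alpha] \\
(\eta_\mu) & \mu\alpha.[\alpha]M & = & M \\
(\beta_\vee) & [\alpha',\beta'](\mu(\alpha,\beta).M) & = & M[\alpha'/\alpha, \beta'/\beta] \\
(\eta_\vee) & \mu(\alpha,\beta).[\alpha,\beta]M & = & M \\
(\beta_\bot) & \mu\alpha^{\bot}.C[[\alpha]M] & = & C[M] \\
(\zeta_\Rightarrow) & (\mu\alpha^{\sigma \Rightarrow \tau}.C[[\alpha]M])\,N & = & \mu\beta^{\tau}.C[[\beta](M\,N)] \\
(\zeta_\wedge) & \pi_i(\mu\alpha^{\sigma_1 \wedge \sigma_2}.C[[\alpha]M]) & = & \mu\beta^{\sigma_i}.C[[\beta](\pi_i M)] \\
(\zeta_\vee) & [\alpha,\beta](\mu\delta.C[[\delta]M]) & = & C[[\alpha,\beta]M]
\end{array}
\]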

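Theorem 1. The CPS translation is sound:
$\Gamma \vdash M = N : \sigma \mid \Delta$ implies
$\Gamma^\circ \multimap R \,;\, \Delta^\circ \vdash M^\circ = N^\circ : \sigma^\circ \multimap R$.

Corollary 1. The interpretation of the calculus into any symmetric monoidal
closed category with linear exponential comonad and sums is sound:
$\Gamma \vdash M = N : \sigma \mid \Delta$ implies
$[\![\Gamma \vdash M : \sigma \mid \Delta]\!] = [\![\Gamma \vdash N : \sigma \mid \Delta]\!]$.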

4.4 Discussion: Completeness 

We conjecture that the equational theory of the calculus given above is not just
sound but also complete for the CPS semantics, hence also for the models given by
categories of linear continuations. (This should be the case, as the usual
(non-linear) $\lambda\mu$-calculus is likely to be a conservative extension of our
calculus and its term model gives rise to a category of continuations [16,26],
hence a complete model.)

The main problem for showing the completeness, however, is that the types
and terms are not sufficiently rich to give rise to a category of linear
continuations as the term model; it is not very clear how to derive a base category
and the response object $R$, as well as the linear exponential comonad !. This
also suggests that the calculus may not be sufficient for explaining the nature of
linear continuation-passing: there seems to be some room for further improvement.

4.5 Girard Translation as a CPS Transformation

In Example 2, we have noted that the standard construction of cartesian closed
categories from models of linear logic can be regarded as an instance of our
construction of categories of linear continuations. This means that the Girard
translation from intuitionistic logic into (classical) linear logic can also be
derived as an instance of our CPS transformation and extends to our
$\lambda\mu$-calculus; in this reading, it really is a simply typed lambda calculus
enriched with par’s. The derived translation (obtained by taking the opposite of the
term model of the linear lambda calculus and then letting $R$ be
$\boldsymbol{\bot}$) is straightforwardly described as follows, using the linear
lambda calculus DCLL [15] as the target.
For types: $b^\circ = b$, $\top^\circ = \top$,
$(\sigma \wedge \tau)^\circ = \sigma^\circ \mathbin{\&} \tau^\circ$,
$(\sigma \Rightarrow \tau)^\circ = \sigma^\circ \to \tau^\circ$,
$\bot^\circ = \boldsymbol{\bot}$ and
$(\sigma \vee \tau)^\circ = \sigma^\circ \mathbin{⅋} \tau^\circ = (\sigma^\circ \multimap \boldsymbol{\bot}) \multimap (\tau^\circ \multimap \boldsymbol{\bot}) \multimap \boldsymbol{\bot}$.
For terms, we have $x^\circ = x$, $(\lambda x.M)^\circ = \lambda\!\!\lambda x.M^\circ$,
$(M\,N)^\circ = M^\circ \mathbin{@} N^\circ$, $*^\circ = \langle\rangle$,
$\langle M, N \rangle^\circ = \langle M^\circ, N^\circ \rangle$,
$(\pi_1 M)^\circ = \mathsf{fst}\,M^\circ$, $(\pi_2 M)^\circ = \mathsf{snd}\,M^\circ$,
$([\alpha]M)^\circ = \alpha\,M^\circ$,
$(\mu\alpha^{\sigma}.M)^\circ = \mathsf{C}_{\sigma^\circ}(\lambda\alpha.M^\circ)$,
$([\alpha,\beta]M)^\circ = M^\circ\,\alpha\,\beta$ and
$(\mu(\alpha,\beta).M)^\circ = \lambda\alpha.\lambda\beta.M^\circ$, where the combinator
$\mathsf{C}_{\sigma} : ((\sigma \multimap \boldsymbol{\bot}) \multimap \boldsymbol{\bot}) \multimap \sigma$
expresses the isomorphism from
$(\sigma \multimap \boldsymbol{\bot}) \multimap \boldsymbol{\bot}$ to $\sigma$.
A judgement $\Gamma \vdash M : \sigma \mid \Delta$ is sent to
$\Gamma^\circ \,;\, \Delta^\circ \multimap \boldsymbol{\bot} \vdash M^\circ : \sigma^\circ$.
The soundness of this translation is just an instance of Corollary 1.
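
As a worked instance of the clauses above, consider the excluded-middle term of Example 4, written here with bound names $\kappa$ and $\alpha$. The computation is a sketch under the assumption that $\lambda$-abstraction is sent to non-linear abstraction (written $\lambda\!\!\lambda$) and that $\to$ is the non-linear function type.

```latex
\begin{align*}
([\alpha]x)^\circ &= \alpha\,x \\
(\lambda x^{\sigma}.[\alpha]x)^\circ &= \lambda\!\!\lambda x.\,\alpha\,x \\
([\kappa](\lambda x^{\sigma}.[\alpha]x))^\circ &= \kappa\,(\lambda\!\!\lambda x.\,\alpha\,x) \\
(\mu(\kappa,\alpha).[\kappa](\lambda x^{\sigma}.[\alpha]x))^\circ
  &= \lambda\kappa.\lambda\alpha.\,\kappa\,(\lambda\!\!\lambda x.\,\alpha\,x) \\
  &: ((\sigma^\circ \to \boldsymbol{\bot}) \multimap \boldsymbol{\bot})
     \multimap (\sigma^\circ \multimap \boldsymbol{\bot}) \multimap \boldsymbol{\bot}
\end{align*}
```

The resulting type is the unfolding of the par of $\sigma^\circ \to \boldsymbol{\bot}$ and $\sigma^\circ$, matching the type clause for disjunction.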

5 Concluding Remarks 

In this paper we proposed a semantic approach for linearly used continuations in
the call-by-name setting, and developed a relevant semantic construction and also a
syntactic calculus for such linear controls, together with its CPS transformation.

However, we must say that this work is still premature (at least not as
successful as the case of the call-by-value setting for now), and there remain
many important issues to be addressed. Among them, we already mentioned two
major problems (which are related to each other): (1) the lack of direct-style
models, and (2) the completeness problem. This situation is really frustrating,
compared with the call-by-value setting, for which we have satisfactory answers to
both [14]. The non-premonoidal disjunction in particular raises serious obstacles
for a direct/complete axiomatization.

Another issue which we wish to understand is how the Filinski duality
[11,26,27] between call-by-name and call-by-value languages with first-class
continuations can be related to our approach on linear controls. To sketch the
overall picture, below we list the interpretations of the function type
$A \Rightarrow B$ in the possible combinations of linearity and non-linearity:

                                           call-by-name                    call-by-value
non-linear lang. with non-linear control   $(A \to R) \times B$            $(B \to R) \to (A \to R)$
non-linear lang. with linear control       $!(A \multimap R) \otimes B$    $(B \to R) \multimap (A \to R)$
linear lang. with non-linear control       $(A \to R) \otimes !B$          $(B \multimap R) \to (A \multimap R)$
linear lang. with linear control           $(A \multimap R) \otimes B$     $(B \multimap R) \multimap (A \multimap R)$

The top row was studied in detail by Selinger [26]. The bottom row (purely linear
setting) is more or less trivial. We are most interested in the second row, but also
in the third, as there seem to exist certain dualities between the call-by-name
non-linear language with linear control and the call-by-value linear language with
non-linear control

\[ (!(A \multimap R) \otimes B) \multimap R \;\cong\; (A \multimap R) \to (B \multimap R), \]

and also between the call-by-value non-linear language with linear control and
the call-by-name linear language with non-linear control

\[ ((A \to R) \otimes !B) \multimap R \;\cong\; (A \to R) \multimap (B \to R). \]
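
Both isomorphisms can be checked by currying. The sketch below assumes the standard isomorphism $(X \otimes Y) \multimap Z \cong X \multimap (Y \multimap Z)$ of a symmetric monoidal closed category and the abbreviation $\sigma \to \tau \equiv\ !\sigma \multimap \tau$ for the non-linear function type.

```latex
\begin{align*}
(!(A \multimap R) \otimes B) \multimap R
  &\cong\ !(A \multimap R) \multimap (B \multimap R)
   \;=\; (A \multimap R) \to (B \multimap R), \\
((A \to R) \otimes !B) \multimap R
  &\cong (A \to R) \multimap (!B \multimap R)
   \;=\; (A \to R) \multimap (B \to R).
\end{align*}
```

In each line the left-hand side is the type of continuations for a call-by-name function type, and the right-hand side is a call-by-value function type with the roles of $A$ and $B$ exchanged.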

The practical interest in the third row may be rather limited, but we still hope
that this duality-based view provides some good insights into the nature of linear
controls in call-by-name and call-by-value settings, potentially with some tie-up
with other computational features such as recursion and iteration [18].

On the more practical side, we have not yet demonstrated the usefulness of
this approach for reasoning about call-by-name programs with first-class
(linear) controls. Perhaps this also reflects our lack of experience with
call-by-name programming languages with first-class control primitives, whose
practical advantage is, we believe, yet to be understood.

Acknowledgements. I thank Olivier Danvy for asking me how linear CPS for
call-by-name can be semantically understood, and anonymous reviewers for helpful
comments.
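
A Dual Intuitionistic Linear Logic

Types and Terms

\[
\begin{array}{rcl}
\sigma & ::= & b \mid I \mid \sigma \otimes \sigma \mid \sigma \multimap \sigma \mid\ !\sigma \\
M & ::= & x \mid * \mid \mathsf{let}\ *\ \mathsf{be}\ M\ \mathsf{in}\ M \mid M \otimes M \mid \mathsf{let}\ x^{\sigma} \otimes x^{\sigma}\ \mathsf{be}\ M\ \mathsf{in}\ M \mid \\
  &     & \lambda x^{\sigma}.M \mid M\,M \mid\ !M \mid \mathsf{let}\ !x^{\sigma}\ \mathsf{be}\ M\ \mathsf{in}\ M
\end{array}
\]

Typing

\[
\frac{}{\Gamma_1, x : \sigma, \Gamma_2 \,;\, \emptyset \vdash x : \sigma}\ (\text{Int-Ax})
\qquad
\frac{}{\Gamma \,;\, x : \sigma \vdash x : \sigma}\ (\text{Lin-Ax})
\]

\[
\frac{}{\Gamma \,;\, \emptyset \vdash * : I}\ (I\,\mathrm{I})
\qquad
\frac{\Gamma ; \Delta_1 \vdash M : I \quad \Gamma ; \Delta_2 \vdash N : \sigma}
     {\Gamma ; \Delta_1 \sharp \Delta_2 \vdash \mathsf{let}\ *\ \mathsf{be}\ M\ \mathsf{in}\ N : \sigma}\ (I\,\mathrm{E})
\]

\[
\frac{\Gamma ; \Delta_1 \vdash M : \sigma_1 \quad \Gamma ; \Delta_2 \vdash N : \sigma_2}
     {\Gamma ; \Delta_1 \sharp \Delta_2 \vdash M \otimes N : \sigma_1 \otimes \sigma_2}\ (\otimes\,\mathrm{I})
\qquad
\frac{\Gamma ; \Delta_1 \vdash M : \sigma_1 \otimes \sigma_2 \quad \Gamma ; \Delta_2, x : \sigma_1, y : \sigma_2 \vdash N : \tau}
     {\Gamma ; \Delta_1 \sharp \Delta_2 \vdash \mathsf{let}\ x^{\sigma_1} \otimes y^{\sigma_2}\ \mathsf{be}\ M\ \mathsf{in}\ N : \tau}\ (\otimes\,\mathrm{E})
\]

\[
\frac{\Gamma ; \Delta, x : \sigma_1 \vdash M : \sigma_2}
     {\Gamma ; \Delta \vdash \lambda x^{\sigma_1}.M : \sigma_1 \multimap \sigma_2}\ (\multimap \mathrm{I})
\qquad
\frac{\Gamma ; \Delta_1 \vdash M : \sigma_1 \multimap \sigma_2 \quad \Gamma ; \Delta_2 \vdash N : \sigma_1}
     {\Gamma ; \Delta_1 \sharp \Delta_2 \vdash M\,N : \sigma_2}\ (\multimap \mathrm{E})
\]

\[
\frac{\Gamma ; \emptyset \vdash M : \sigma}
     {\Gamma ; \emptyset \vdash\ !M :\ !\sigma}\ (!\,\mathrm{I})
\qquad
\frac{\Gamma ; \Delta_1 \vdash M :\ !\sigma \quad \Gamma, x : \sigma \,;\, \Delta_2 \vdash N : \tau}
     {\Gamma ; \Delta_1 \sharp \Delta_2 \vdash \mathsf{let}\ !x^{\sigma}\ \mathsf{be}\ M\ \mathsf{in}\ N : \tau}\ (!\,\mathrm{E})
\]

where $\Delta_1 \sharp \Delta_2$ represents one of the possible merges of $\Delta_1$ and
$\Delta_2$ as finite lists.

Axioms

\[
\begin{array}{rcl}
\mathsf{let}\ *\ \mathsf{be}\ *\ \mathsf{in}\ M & = & M \\
\mathsf{let}\ x \otimes y\ \mathsf{be}\ M \otimes N\ \mathsf{in}\ L & = & L[M/x, N/y] \\
(\lambda x.M)\,N & = & M[N/x] \\
\mathsf{let}\ !x\ \mathsf{be}\ !M\ \mathsf{in}\ N & = & N[M/x] \\[1ex]
\mathsf{let}\ *\ \mathsf{be}\ M\ \mathsf{in}\ * & = & M \\
\mathsf{let}\ x \otimes y\ \mathsf{be}\ M\ \mathsf{in}\ x \otimes y & = & M \\
\lambda x.M\,x & = & M \\
\mathsf{let}\ !x\ \mathsf{be}\ M\ \mathsf{in}\ !x & = & M \\[1ex]
C[\mathsf{let}\ *\ \mathsf{be}\ M\ \mathsf{in}\ N] & = & \mathsf{let}\ *\ \mathsf{be}\ M\ \mathsf{in}\ C[N] \\
C[\mathsf{let}\ x \otimes y\ \mathsf{be}\ M\ \mathsf{in}\ N] & = & \mathsf{let}\ x \otimes y\ \mathsf{be}\ M\ \mathsf{in}\ C[N] \\
C[\mathsf{let}\ !x\ \mathsf{be}\ M\ \mathsf{in}\ N] & = & \mathsf{let}\ !x\ \mathsf{be}\ M\ \mathsf{in}\ C[N]
\end{array}
\]

where $C[-]$ is a linear context (no ! binds $[-]$).

Semantics. A typing judgement

\[ x_1 : \sigma_1, \ldots, x_m : \sigma_m \,;\, y_1 : \tau_1, \ldots, y_n : \tau_n \vdash M : \sigma \]

is inductively interpreted as a morphism
$[\![x_1 : \sigma_1, \ldots \,;\, y_1 : \tau_1, \ldots \vdash M : \sigma]\!]$
from $![\![\sigma_1]\!] \otimes \cdots \otimes\ ![\![\sigma_m]\!] \otimes [\![\tau_1]\!] \otimes \cdots \otimes [\![\tau_n]\!]$
to $[\![\sigma]\!]$ in a symmetric monoidal closed category with a linear exponential
comonad !.

Proposition 4 (categorical completeness). The equational theory of DILL
is sound and complete for categorical models given by symmetric monoidal closed
categories with linear exponential comonads:
$\Gamma ; \Delta \vdash M = N : \sigma$ is provable if and only if
$[\![\Gamma ; \Delta \vdash M : \sigma]\!] = [\![\Gamma ; \Delta \vdash N : \sigma]\!]$
holds for every such model.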
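
B !-Strong Monads

Let $\mathcal{D}$ be a symmetric monoidal category with a linear exponential comonad !.
A monad $(T, \eta, \mu)$ on $\mathcal{D}$ is !-strong (or: strong with respect to ! [10])
if it is equipped with a natural transformation called !-strength (strength with
respect to !)

\[ \theta_{A,X} : !A \otimes TX \longrightarrow T(!A \otimes X) \]

subject to the following axioms (commutative diagrams, written here as equations
with the structural isomorphisms of the monoidal structure left implicit).

\begin{align*}
\theta_{I,X} \circ (m_I \otimes TX)
  &= T(m_I \otimes X)
  && : I \otimes TX \to T(!I \otimes X) \\
\theta_{A \otimes B, X} \circ (m_{A,B} \otimes TX)
  &= T(m_{A,B} \otimes X) \circ \theta_{A,\,!B \otimes X} \circ (!A \otimes \theta_{B,X})
  && : !A \otimes !B \otimes TX \to T(!(A \otimes B) \otimes X) \\
\theta_{A,X} \circ (!A \otimes \eta_X)
  &= \eta_{!A \otimes X}
  && : !A \otimes X \to T(!A \otimes X) \\
\theta_{A,X} \circ (!A \otimes \mu_X)
  &= \mu_{!A \otimes X} \circ T\theta_{A,X} \circ \theta_{A,TX}
  && : !A \otimes T^2 X \to T(!A \otimes X)
\end{align*}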

Abstract. Herbelin presented (at CSL’94) an explicit substitution calculus
with a sequent calculus as a type system, in which reduction steps
correspond to cut-elimination steps. The calculus, extended with some
rules for substitution propagation, simulates $\beta$-reduction of ordinary
$\lambda$-calculus. In this paper we present a proof of strong normalization for the
typable terms of the calculus. The proof is a direct one in the sense that
it does not depend on the result of strong normalization for the simply
typed $\lambda$-calculus, unlike an earlier proof by Dyckhoff and Urban.



1 Introduction 

In [12], Herbelin introduced a sequent calculus in which a unique cut-free proof is
associated to each normal term of the simply typed $\lambda$-calculus. This is in
contrast to the usual assignment (see, e.g. [18, p. 73]) which associates several
cut-free proofs to the same normal term. Herbelin developed a term calculus whose
reduction steps correspond to cut-elimination steps in the sequent calculus. Some
of the cut rules introduce explicit substitution operators [1], and cut-propagation
steps of cut-elimination correspond to the propagation of explicit substitutions.
Herbelin proved strong normalization for the typed terms of his calculus.

However, he also observed that the reduction rules in the calculus are not
enough to simulate full $\beta$-reduction (e.g., it fails to simulate the leftmost
reduction). Espirito Santo [11] and Dyckhoff and Urban [10] identified a set of
terms in the calculus that correspond to terms in the untyped $\lambda$-calculus, and
introduced additional reduction rules needed to allow substitutions to propagate
properly. Thus it turned out that ordinary $\lambda$-calculus can be completely
embedded in Herbelin’s calculus with the additional rules, and in the typed case
$\beta$-reduction can be analyzed through cut-elimination.

On the other hand, it is reported [8] that Herbelin’s sequent calculus is par- 
ticularly suited to proof search used in the theory of logic programming. This 
means that the extended Herbelin’s calculus is a promising candidate for a proof- 
theoretic basis for integrating functional and logic programming languages. 

* This work was supported by JSPS Research Fellowships for Young Scientists and by 
Grant-in-Aid for JSPS Fellows. 
Since the extended Herbelin's calculus behaves as an explicit substitution
calculus for the $\lambda$-calculus, it is expected that various techniques from the field
of explicit substitutions work as well for this calculus. Dyckhoff and Urban [10]
indeed proved strong normalization for the typed terms of the calculus, using
the method of [4] and the result of strong normalization for the simply typed
$\lambda$-calculus. Note that as shown in [15], strong normalization for typed terms of
an explicit substitution calculus is not a trivial property. In fact, a careful choice
of reduction rules is required for proving strong normalization of cut-elimination
that simulates $\beta$-reduction.

In this paper we prove strong normalization for the typable terms of the
extended Herbelin's calculus directly without using the result for the simply typed
$\lambda$-calculus. Our proof is an adaptation of the reducibility method [17] to explicit
substitution calculus and to Herbelin-style calculus. For the explicit substitution
calculus $\lambda\mathsf{x}$ [5], such adaptations have been considered in [6,7]. Compared with
their proofs, ours makes use of a more general closure condition on reducibility
sets; they are closed under x-conversion whenever the term is decent (Lemma 11).
This is closely related to a lemma for an inductive proof of preservation of strong
normalization (PSN) as developed in [2,5]. Our method is easily applicable to
the case of $\lambda\mathsf{x}$, simplifying the proofs in [6,7].

The paper is organized as follows. In Section 2 we introduce the calculus and
type system. In Section 3 we consider the subset of terms that correspond to
ordinary $\lambda$-terms. In Section 4 we study a subcalculus that plays an important
role in our proofs. In Section 5 we explain how to simulate $\beta$-reduction in the
calculus. In Section 6 we prove the main lemma, from which we derive PSN, and
in Section 7 we give a reducibility proof of strong normalization using the main
lemma. Finally in Section 8 we conclude and give suggestions for further work.

To save space we omit some of the proofs, but a full version with all proofs
is available at http://www.math.s.chiba-u.ac.jp/~kentaro/index.html.

2 $\lambda\mathsf{x}$-calculus

Table 1 presents the syntax and typing rules of $\lambda\mathsf{x}$-calculus, which is the same
as the calculus (AO + ES + B) in [10] and varies a little from the calculi in
[11], although we mainly follow notations in the latter. The syntax of $\lambda\mathsf{x}$ has
two kinds of expressions: terms and lists of terms, ranged over by $t, u, v$ and
by $l, l'$, respectively. The set of terms is denoted by $\mathcal{T}_{\lambda\mathsf{x}}$ and the set of lists
of terms by $\mathcal{L}_{\lambda\mathsf{x}}$. Elements of $\mathcal{T}_{\lambda\mathsf{x}} \cup \mathcal{L}_{\lambda\mathsf{x}}$ are called $\lambda\mathsf{x}$-terms and ranged over
by $a, b, c$. In $a\langle x := v\rangle$, $\langle x := v\rangle$ is called an explicit substitution or simply
substitution and $v$ is called the body of the substitution. The notions of free and
bound variables are defined as usual, with an additional clause that the variable
$x$ in $a\langle x := v\rangle$ binds the free occurrences of $x$ in $a$. We assume the following
variable convention: names of bound variables are different from the names of
free variables, and, moreover, different occurrences of the abstraction operator
have different binding variables. The set of free variables of a $\lambda\mathsf{x}$-term $a$ is denoted
by $FV(a)$. The symbol = denotes syntactical equality modulo $\alpha$-conversion.
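The two syntactic categories and the extra binding clause for explicit substitutions can be sketched in code. The following is only an illustration: the tuple encoding and the tag names ("var", "sub", "cat", and so on) are assumptions made here, not notation from the paper.

```python
# Hypothetical tuple encoding of lambda-x expressions (tags are illustrative).
# Terms: ("var", x, l)      x l
#        ("lam", x, t)      \x.t
#        ("app", t, l)      t l
#        ("sub", t, x, v)   t<x := v>
# Lists: ("nil",)           []
#        ("cons", u, l)     u :: l
#        ("cat", l1, l2)    l1 l2
#        ("lsub", l, x, v)  l<x := v>

NIL = ("nil",)


def fv(a):
    """Return the set of free variables of a term or a list of terms."""
    tag = a[0]
    if tag == "nil":
        return set()
    if tag == "var":
        return {a[1]} | fv(a[2])
    if tag == "lam":
        return fv(a[2]) - {a[1]}
    if tag in ("app", "cons", "cat"):
        return fv(a[1]) | fv(a[2])
    if tag in ("sub", "lsub"):
        # <x := v> binds x in the left part only; variables of v stay free.
        return (fv(a[1]) - {a[2]}) | fv(a[3])
    raise ValueError(f"unknown tag: {tag}")


y = ("var", "y", NIL)
z = ("var", "z", NIL)
# The term (x (y[] :: []))<x := z[]>
example = ("sub", ("var", "x", ("cons", y, NIL)), "x", z)
print(sorted(fv(example)))
```

In the example the occurrence of x is bound by the substitution, so only the variables of the argument list and of the substitution body remain free.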
Table 1. $\lambda\mathsf{x}$-calculus

A typing context, ranged over by $\Gamma$, is a finite set of pairs $\{x_1 : A_1, \ldots, x_n : A_n\}$
where the variables are pairwise distinct. $\Gamma, x : A$ denotes the union $\Gamma \cup \{x : A\}$
where $x$ does not appear in $\Gamma$. There are two kinds of derivable sequents:
$\Gamma; - \vdash t : B$ and $\Gamma; A \vdash l : B$, both of which have a distinguished position in
the LHS called stoup. The crucial restriction of this sequent calculus is that the
rule $L{\supset}$ introduces $A \supset B$ in the stoup and $B$ has to be in the stoup of the
right subderivation's endsequent. In the cut-free case, the last rule of the right
subderivation of an instance of $L{\supset}$ is again $L{\supset}$ and so on until Ax is reached.
This yields an assignment of a unique cut-free proof to each normal term of the
simply typed $\lambda$-calculus (cf. [18, Section 6.3]).

Fig. 1. Key-case

The notion of $\lambda\mathsf{x}$-reduction is defined by the contextual closures of all
reduction rules in Table 1. We use $\to_{\lambda\mathsf{x}}$ for one-step reduction, $\to^{+}_{\lambda\mathsf{x}}$ for its transitive
closure, and $\to^{*}_{\lambda\mathsf{x}}$ for its reflexive transitive closure. The set of $\lambda\mathsf{x}$-terms that are
strongly normalizing with respect to $\lambda\mathsf{x}$-reduction is denoted by $\mathcal{SN}_{\lambda\mathsf{x}}$. These
kinds of notations are also used for the notions of other reductions introduced
in this paper.

The subcalculus of $\lambda\mathsf{x}$ without the Beta-rule is denoted by x. This subcalculus
plays an important role in this paper and is studied in Section 4.

Herbelin's original $\overline{\lambda}$-calculus [12] is essentially the calculus without the last
four reduction rules in Table 1. These rules are necessary for the simulation of
full $\beta$-reduction of the $\lambda$-calculus.

The reduction rules of $\lambda\mathsf{x}$-calculus also define cut-elimination procedures for
typing derivations of $\lambda\mathsf{x}$-terms, which ensures that the subject reduction
property holds in this type system. In Figure 1 we display the key-case of the
cut-elimination, which corresponds to the Beta-rule of $\lambda\mathsf{x}$-calculus.

3 Pure Terms 

Table 2 presents the syntax of pure terms, which are the subset of $\lambda\mathsf{x}$-terms
that correspond to terms of ordinary $\lambda$-calculus. The grammar of pure terms is
close to the inductive characterization of the set of $\lambda$-terms found, e.g. in [14].
For the definition of $\beta$-reduction on pure terms, we need meta-substitution $[\_/\_]$,
which requires further meta-operations $\{\_\}\_$ and $\_@\_$, since a $\lambda\mathsf{x}$-term obtained
by substituting a pure term for a variable is not in general a pure term. Note the
similarity between the definition of these meta-operations and the reduction rules
of $\lambda\mathsf{x}$-calculus. $\lambda\mathsf{x}$-calculus can be considered in some sense a calculus making
these meta-operations explicit, while usual explicit substitution calculi make the
usual substitution explicit.

Table 2. Pure terms

$t, u, v ::= x\,l \mid \lambda x.t \mid (\lambda x.t)(u :: l)$
$l, l' ::= [\,] \mid t :: l$

$(\beta)\quad (\lambda x.t)(u :: l) \to \{t[u/x]\}l$

where

$[\,]@l' =_{\mathrm{def}} l'$
$(u :: l)@l' =_{\mathrm{def}} u :: (l@l')$

$[\,][v/x] =_{\mathrm{def}} [\,]$
$(u :: l)[v/x] =_{\mathrm{def}} u[v/x] :: l[v/x]$

$\{x\,l\}l' =_{\mathrm{def}} x(l@l')$
$\{\lambda y.t\}[\,] =_{\mathrm{def}} \lambda y.t$
$\{\lambda y.t\}(u :: l) =_{\mathrm{def}} (\lambda y.t)(u :: l)$
$\{(\lambda y.t)(u :: l)\}l' =_{\mathrm{def}} (\lambda y.t)(u :: (l@l'))$

$(y\,l)[v/x] =_{\mathrm{def}} y\,l[v/x] \quad (y \neq x)$
$(x\,l)[v/x] =_{\mathrm{def}} \{v\}l[v/x]$
$(\lambda y.t)[v/x] =_{\mathrm{def}} \lambda y.t[v/x]$
$((\lambda y.t)(u :: l))[v/x] =_{\mathrm{def}} (\lambda y.t[v/x])(u[v/x] :: l[v/x])$
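The meta-operations on pure terms translate almost line by line into recursive functions. The sketch below is an illustration under stated assumptions: pure terms are encoded as tuples with made-up tags, the function names append, plug, subst and beta_root are hypothetical, and the variable convention of Section 2 is assumed, so no renaming of bound variables is performed.

```python
# Hypothetical encoding of pure terms:
#   ("var", x, l)   x l          ("lam", x, t)   \x.t
#   ("app", f, l)   (\x.t)(u :: l), with f a "lam" and l a "cons"
#   ("nil",)        []           ("cons", u, l)  u :: l
NIL = ("nil",)


def append(l, l2):
    """l @ l2 on lists of pure terms."""
    if l == NIL:
        return l2
    _, u, rest = l
    return ("cons", u, append(rest, l2))


def plug(t, l):
    """{t}l: apply the pure term t to the list l, yielding a pure term."""
    if t[0] == "var":
        return ("var", t[1], append(t[2], l))
    if t[0] == "lam":
        return t if l == NIL else ("app", t, l)
    return ("app", t[1], append(t[2], l))   # t = (\y.t')(u :: l')


def subst(a, v, x):
    """a[v/x]; assumes the variable convention, so capture cannot occur."""
    tag = a[0]
    if tag == "nil":
        return a
    if tag == "cons":
        return ("cons", subst(a[1], v, x), subst(a[2], v, x))
    if tag == "var":
        l = subst(a[2], v, x)
        return plug(v, l) if a[1] == x else ("var", a[1], l)
    if tag == "lam":
        return ("lam", a[1], subst(a[2], v, x))
    return ("app", subst(a[1], v, x), subst(a[2], v, x))


def beta_root(a):
    """One beta step at the root: (\\x.t)(u :: l) -> {t[u/x]}l."""
    _, (_, x, t), (_, u, l) = a
    return plug(subst(t, u, x), l)


ident = ("lam", "z", ("var", "z", NIL))
# (\x. x (x[] :: [])) (ident :: [])
redex = ("app",
         ("lam", "x", ("var", "x", ("cons", ("var", "x", NIL), NIL))),
         ("cons", ident, NIL))
print(beta_root(redex))
```

Contracting the example redex substitutes the identity for both occurrences of x, and plug turns the head occurrence into a new redex whose argument list holds the other copy.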

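Here we give some properties of these meta-operations.

Lemma 1. For any pure term $t \in \mathcal{T}_{\lambda\mathsf{x}}$, $\{t\}[\,] = t$.

Lemma 2. For any pure terms $a \in \mathcal{T}_{\lambda\mathsf{x}} \cup \mathcal{L}_{\lambda\mathsf{x}}$ and $v \in \mathcal{T}_{\lambda\mathsf{x}}$, if $x \notin FV(a)$ then
$a[v/x] = a$.

Lemma 3 (Substitution lemma). For any pure terms $a \in \mathcal{T}_{\lambda\mathsf{x}} \cup \mathcal{L}_{\lambda\mathsf{x}}$ and
$u, v \in \mathcal{T}_{\lambda\mathsf{x}}$, if $x \notin FV(u)$ and $x \neq y$ then $a[v/x][u/y] = a[u/y][v[u/y]/x]$.

Lemma 4. For any pure terms $a, a' \in \mathcal{T}_{\lambda\mathsf{x}} \cup \mathcal{L}_{\lambda\mathsf{x}}$ and $v \in \mathcal{T}_{\lambda\mathsf{x}}$, if $a \to_{\beta} a'$ then
$a[v/x] \to_{\beta} a'[v/x]$.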



A Direct Proof of Strong Normalization for an Extended Herbelin’s Calculus 






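Table 3. Translations $\Psi$ and $\Theta$

$\Psi(x) =_{\mathrm{def}} x[\,]$
$\Psi(MN) =_{\mathrm{def}} \{\Psi(M)\}(\Psi(N) :: [\,])$
$\Psi(\lambda x.M) =_{\mathrm{def}} \lambda x.\Psi(M)$

$\Theta(x\,l) =_{\mathrm{def}} \Theta'(x, l)$
$\Theta(\lambda x.t) =_{\mathrm{def}} \lambda x.\Theta(t)$
$\Theta((\lambda x.t)(u :: l)) =_{\mathrm{def}} \Theta'(\lambda x.\Theta(t), u :: l)$

$\Theta'(M, [\,]) =_{\mathrm{def}} M$
$\Theta'(M, u :: l) =_{\mathrm{def}} \Theta'(M\,\Theta(u), l)$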



Translations between pure terms and ordinary $\lambda$-terms are given in [11] and
[10] through a grammar of $\lambda$-terms that is different from the usual one. Here we
define translations between $\lambda$-terms with the usual grammar and pure terms as
shown in Table 3.
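As an illustration of how such translations can be realized, the sketch below converts between ordinary $\lambda$-terms and pure terms. The tuple encodings and the names psi, theta and theta_list are hypothetical. The application clause, which sends $MN$ to $\{\Psi(M)\}(\Psi(N) :: [\,])$, is an assumption made here so that the result is again a pure term.

```python
# Hypothetical encodings.
#   lambda-terms: ("v", x) | ("ap", M, N) | ("l", x, M)
#   pure terms:   ("var", x, l) | ("lam", x, t) | ("app", f, l)
#   lists:        ("nil",) | ("cons", u, l)
NIL = ("nil",)


def append(l, l2):
    """l @ l2 on lists of pure terms."""
    if l == NIL:
        return l2
    return ("cons", l[1], append(l[2], l2))


def plug(t, l):
    """{t}l on pure terms."""
    if t[0] == "var":
        return ("var", t[1], append(t[2], l))
    if t[0] == "lam":
        return t if l == NIL else ("app", t, l)
    return ("app", t[1], append(t[2], l))


def psi(m):
    """Translate an ordinary lambda-term into a pure term."""
    if m[0] == "v":
        return ("var", m[1], NIL)
    if m[0] == "l":
        return ("lam", m[1], psi(m[2]))
    return plug(psi(m[1]), ("cons", psi(m[2]), NIL))


def theta(t):
    """Translate a pure term back into an ordinary lambda-term."""
    if t[0] == "var":
        return theta_list(("v", t[1]), t[2])
    if t[0] == "lam":
        return ("l", t[1], theta(t[2]))
    return theta_list(theta(t[1]), t[2])


def theta_list(m, l):
    """Apply m successively to the translated elements of the list l."""
    while l != NIL:
        _, u, l = l
        m = ("ap", m, theta(u))
    return m


# ((\x. x y) z) w
M = ("ap", ("ap", ("l", "x", ("ap", ("v", "x"), ("v", "y"))), ("v", "z")), ("v", "w"))
print(psi(M))
print(theta(psi(M)) == M)
```

The nested applications of M are flattened by psi into a single argument list, and theta rebuilds the left-nested applications from that list.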

Proposition 1. $\Theta \circ \Psi = \mathrm{id}$ and $\Psi \circ \Theta = \mathrm{id}$.

Theorem 1.

1. For any $\lambda$-terms $M, M'$, if $M \to_{\beta} M'$ then $\Psi(M) \to_{\beta} \Psi(M')$.

2. For any pure terms $t, t' \in \mathcal{T}_{\lambda\mathsf{x}}$, if $t \to_{\beta} t'$ then $\Theta(t) \to_{\beta} \Theta(t')$.

It is also possible to show that the translations preserve the types of terms,
defining translations on typing derivations. Later we see that $\beta$-reduction on
pure terms can be simulated by $\lambda\mathsf{x}$-reduction, thus showing how to simulate
normalization in natural deduction by cut-elimination in Herbelin's sequent
calculus.

4 Properties of the Subcalculus x 

In this section we study properties of the subcalculus x which is obtained from
$\lambda\mathsf{x}$-calculus by deleting the Beta-rule. In the typed case it corresponds to
cut-elimination steps except the key-case. We show that the subcalculus is strongly
normalizing and confluent and that its normal forms are pure terms.

Proposition 2. The subcalculus x is strongly normalizing.

Proof. The proof is by interpretation, following Appendix A of [9]. We define a
function $h : \mathcal{T}_{\lambda\mathsf{x}} \cup \mathcal{L}_{\lambda\mathsf{x}} \to \mathbb{N}$ as follows:

$h(x\,l) =_{\mathrm{def}} h(l) + 1$
$h(\lambda x.t) =_{\mathrm{def}} h(t) + 1$
$h(t\,l) =_{\mathrm{def}} 2 \times h(t) + h(l) + 1$
$h(t\langle z := v\rangle) =_{\mathrm{def}} h(t) \times (3 \times h(v) + 1)$
$h([\,]) =_{\mathrm{def}} 1$
$h(u :: l) =_{\mathrm{def}} h(u) + h(l) + 1$
$h(l\,l') =_{\mathrm{def}} 2 \times h(l) + h(l') + 1$
$h(l\langle z := v\rangle) =_{\mathrm{def}} h(l) \times (3 \times h(v) + 1)$

and observe that if $a \to_{\mathsf{x}} b$ then $h(a) > h(b)$. □
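To see the interpretation at work, the sketch below evaluates $h$ on both sides of one propagation step, $(t\,l)\langle z := v\rangle \to t\langle z := v\rangle\, l\langle z := v\rangle$, which is the step used later in the proof of Lemma 9. The tuple encoding and tag names are assumptions made for this illustration.

```python
# Hypothetical encoding of lambda-x expressions:
#   ("var", x, l), ("lam", x, t), ("app", t, l), ("sub", t, z, v)
#   ("nil",), ("cons", u, l), ("cat", l1, l2), ("lsub", l, z, v)
NIL = ("nil",)


def h(a):
    """The interpretation h from the proof of Proposition 2."""
    tag = a[0]
    if tag == "nil":
        return 1
    if tag in ("var", "lam"):
        return h(a[2]) + 1
    if tag == "cons":
        return h(a[1]) + h(a[2]) + 1
    if tag in ("app", "cat"):
        return 2 * h(a[1]) + h(a[2]) + 1
    if tag in ("sub", "lsub"):
        return h(a[1]) * (3 * h(a[3]) + 1)
    raise ValueError(f"unknown tag: {tag}")


t = ("var", "x", NIL)                      # x[]
l = ("cons", ("var", "y", NIL), NIL)       # y[] :: []
v = ("var", "w", NIL)                      # w[]

lhs = ("sub", ("app", t, l), "z", v)                        # (t l)<z := v>
rhs = ("app", ("sub", t, "z", v), ("lsub", l, "z", v))      # t<z := v> l<z := v>
print(h(lhs), h(rhs))
```

For this rule the difference $h(\mathit{lhs}) - h(\mathit{rhs})$ works out to $3 \times h(v)$ in general, which is always positive; the example instantiates this with $h(v) = 2$.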

Proposition 3. The subcalculus x is confluent.

Proof. By Newman's Lemma, it suffices to check the local confluence. The proof
follows Appendix B of [9]. □

As a result, we can define the unique x-normal form of each $\lambda\mathsf{x}$-term.

Definition 1. Let $a \in \mathcal{T}_{\lambda\mathsf{x}} \cup \mathcal{L}_{\lambda\mathsf{x}}$. The unique x-normal form of $a$ is denoted by
$\mathsf{x}(a)$.

Proposition 4. Let $a \in \mathcal{T}_{\lambda\mathsf{x}} \cup \mathcal{L}_{\lambda\mathsf{x}}$. $a$ is a pure term iff $a$ is in x-normal form.

Proof. The only if part is by induction on the structure of pure terms. We prove
the if part by induction on the structure of $a$. Suppose that $a$ is in x-normal
form. Then by the induction hypothesis, all subterms of $a$ are pure. Here if $a$
is not pure then $a$ is one of the forms $t\,l$ ($\neq (\lambda x.t)(u :: l)$), $t\langle x := v\rangle$, $l\,l'$ and
$l\langle x := v\rangle$ where $t, l, v, l'$ are pure. In any case we see that $a$ is an x-redex, which
is a contradiction. □

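The next proposition shows that the subcalculus x correctly simulates the
meta-operations on pure terms.

Proposition 5. Let $t, v, l, l'$ be pure terms with $t, v \in \mathcal{T}_{\lambda\mathsf{x}}$ and $l, l' \in \mathcal{L}_{\lambda\mathsf{x}}$. Then

1. $l\,l' \to^{*}_{\mathsf{x}} l@l'$,

2. $l\langle y := v\rangle \to^{*}_{\mathsf{x}} l[v/y]$,

3. $t\,l \to^{*}_{\mathsf{x}} \{t\}l$,

4. $t\langle y := v\rangle \to^{*}_{\mathsf{x}} t[v/y]$.

Proof. The first part is by induction on the structure of $l$. The third part is by
a case analysis according to the form of $t\,l$. The remaining two parts are proved
by simultaneous induction on the structure of $l$ or $t$. □

From the above proposition we have the following lemma which allows us
to reduce inference on x-normal form to inference for meta-operations on pure
terms.

Lemma 5. Let $t, v \in \mathcal{T}_{\lambda\mathsf{x}}$ and $l, l' \in \mathcal{L}_{\lambda\mathsf{x}}$. Then

1. $\mathsf{x}(l\,l') = \mathsf{x}(l)@\mathsf{x}(l')$,

2. $\mathsf{x}(l\langle y := v\rangle) = \mathsf{x}(l)[\mathsf{x}(v)/y]$,

3. $\mathsf{x}(t\,l) = \{\mathsf{x}(t)\}\mathsf{x}(l)$,

4. $\mathsf{x}(t\langle y := v\rangle) = \mathsf{x}(t)[\mathsf{x}(v)/y]$.

Proof. We only consider the fourth part. Since $\mathsf{x}(t)$ and $\mathsf{x}(v)$ are pure terms, we
have $\mathsf{x}(t)\langle y := \mathsf{x}(v)\rangle \to^{*}_{\mathsf{x}} \mathsf{x}(t)[\mathsf{x}(v)/y]$ by Proposition 5 (4). Hence,
$\mathsf{x}(t\langle y := v\rangle) = \mathsf{x}(\mathsf{x}(t)\langle y := \mathsf{x}(v)\rangle) = \mathsf{x}(\mathsf{x}(t)[\mathsf{x}(v)/y]) = \mathsf{x}(t)[\mathsf{x}(v)/y]$. □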
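Read from left to right, the four equations of Lemma 5 give a recursive way to compute x-normal forms through the meta-operations. The following sketch is an illustration only: the tuple encoding and the function names are hypothetical, the clauses for the remaining constructors are an assumption (normalize the components in place), and the variable convention is assumed so that substitution needs no renaming.

```python
# Hypothetical encoding:
#   ("var", x, l), ("lam", x, t), ("app", t, l), ("sub", t, y, v)
#   ("nil",), ("cons", u, l), ("cat", l1, l2), ("lsub", l, y, v)
NIL = ("nil",)


def append(l, l2):
    """l @ l2 on lists of pure terms."""
    return l2 if l == NIL else ("cons", l[1], append(l[2], l2))


def plug(t, l):
    """{t}l on pure terms."""
    if t[0] == "var":
        return ("var", t[1], append(t[2], l))
    if t[0] == "lam":
        return t if l == NIL else ("app", t, l)
    return ("app", t[1], append(t[2], l))


def subst(a, v, x):
    """a[v/x] on pure terms and lists (no capture under the variable convention)."""
    tag = a[0]
    if tag == "nil":
        return a
    if tag == "var":
        l = subst(a[2], v, x)
        return plug(v, l) if a[1] == x else ("var", a[1], l)
    if tag == "lam":
        return ("lam", a[1], subst(a[2], v, x))
    return (tag, subst(a[1], v, x), subst(a[2], v, x))   # "cons" or "app"


def xnf(a):
    """x-normal form, computed with the equations of Lemma 5."""
    tag = a[0]
    if tag == "nil":
        return a
    if tag in ("var", "lam"):
        return (tag, a[1], xnf(a[2]))
    if tag == "cons":
        return ("cons", xnf(a[1]), xnf(a[2]))
    if tag == "app":
        return plug(xnf(a[1]), xnf(a[2]))            # Lemma 5 (3)
    if tag == "cat":
        return append(xnf(a[1]), xnf(a[2]))          # Lemma 5 (1)
    return subst(xnf(a[1]), xnf(a[3]), a[2])         # Lemma 5 (2) and (4)


y = ("var", "y", NIL)
z = ("var", "z", NIL)
# (x[]<x := y[]>) (z[] :: [])
a = ("app", ("sub", ("var", "x", NIL), "x", y), ("cons", z, NIL))
print(xnf(a))
```

In the example the substitution is resolved first and the argument list is then attached to the head variable, so the result is a pure term with a single list of arguments.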
5 Simulation of $\beta$-reduction

Now we are in a position to show that $\lambda\mathsf{x}$-reduction simulates $\beta$-reduction.

Theorem 2. For any pure terms $a, b \in \mathcal{T}_{\lambda\mathsf{x}} \cup \mathcal{L}_{\lambda\mathsf{x}}$, if $a \to_{\beta} b$ then $a \to^{+}_{\lambda\mathsf{x}} b$.

Proof. By induction on the structure of $a$. We treat the case $a = (\lambda x.t)(u :: l)$,
$b = \{t[u/x]\}l$. Then use $\to_{\mathsf{Beta}}$ to create $t\langle x := u\rangle l$, and use Proposition 5 (4)
and (3) to reach $\{t[u/x]\}l$. □

Since the translations in Section 3 preserve the types of terms, the proof
of the above theorem indicates how to simulate normalization in natural
deduction by cut-elimination in Herbelin's sequent calculus. Specifically, a redex
in natural deduction is translated into the key-case corresponding to a
Beta-redex $(\lambda x.t)(u :: l)$. Then transformation is performed as in Figure 1 to create
the proof corresponding to $t\langle x := u\rangle l$, followed by cut-reduction steps to reach
the proof corresponding to $\{t[u/x]\}l$. The latter cut-reduction steps are in fact
strongly normalizing and confluent, since they correspond to reduction steps of
the subcalculus x.

The strictness in Theorem 2 has a nice consequence.

Corollary 1. Let $a \in \mathcal{T}_{\lambda\mathsf{x}} \cup \mathcal{L}_{\lambda\mathsf{x}}$. If $a \in \mathcal{SN}_{\lambda\mathsf{x}}$ then $\mathsf{x}(a) \in \mathcal{SN}_{\beta}$.

Proof. Suppose $\mathsf{x}(a) \notin \mathcal{SN}_{\beta}$. Using Theorem 2 we get an infinite $\lambda\mathsf{x}$-reduction
sequence starting with $\mathsf{x}(a)$. Since $a \to^{*}_{\mathsf{x}} \mathsf{x}(a)$ we have $a \notin \mathcal{SN}_{\lambda\mathsf{x}}$. □

6 Main Lemma 

In this section we give an inductive proof of PSN for $\lambda\mathsf{x}$-calculus with respect
to $\beta$-reduction on pure terms. Although PSN itself was already proved in [10]
in a different way, our main lemma is also useful for the reducibility proof of
strong normalization in the next section. We follow the method of [2,3,5] for
$\lambda\mathsf{x}$-calculus, in which the key notions are void reduction and decent terms.

Definition 2. A substitution $\langle x := v\rangle$ is said to be void in $a\langle x := v\rangle$ if
$x \notin FV(\mathsf{x}(a))$. Void reduction is $\lambda\mathsf{x}$-reduction inside the body of a void substitution
(more precisely, it is the contextual closure of the reduction: $a\langle z := v\rangle \to
a\langle z := v'\rangle$ where $v \to_{\lambda\mathsf{x}} v'$ and $z \notin FV(\mathsf{x}(a))$).

As in the case of $\lambda\mathsf{x}$, we have the following lemmas.

Lemma 6 (Projection). Let $a, b \in \mathcal{T}_{\lambda\mathsf{x}} \cup \mathcal{L}_{\lambda\mathsf{x}}$.

1. If $a \to_{\lambda\mathsf{x}} b$, then $\mathsf{x}(a) \to^{*}_{\beta} \mathsf{x}(b)$.

2. If $a \to_{\mathsf{Beta}} b$ is not a void reduction, then $\mathsf{x}(a) \to^{+}_{\beta} \mathsf{x}(b)$.
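Lemma 7. If $a_0, a_1, \ldots \in \mathcal{T}_{\lambda\mathsf{x}} \cup \mathcal{L}_{\lambda\mathsf{x}}$ such that $\mathsf{x}(a_0) \in \mathcal{SN}_{\beta}$ and
$a_0 \to_{\lambda\mathsf{x}} a_1 \to_{\lambda\mathsf{x}} \ldots$ is an infinite $\lambda\mathsf{x}$-reduction sequence, there is a $k \in \mathbb{N}$
such that for all $i \geq k$, $a_i \to_{\lambda\mathsf{x}} a_{i+1}$ is void.

Proof. Since $\to_{\mathsf{x}}$ is strongly normalizing, we may assume that the infinite
$\lambda\mathsf{x}$-reduction sequence has the form $a_0 \to^{*}_{\mathsf{x}} a_1 \to_{\mathsf{Beta}} a_2 \to^{*}_{\mathsf{x}} a_3 \to_{\mathsf{Beta}} \ldots$. Now, by
Lemma 6 (1), we have $\mathsf{x}(a_0) \to^{*}_{\beta} \mathsf{x}(a_2) \to^{*}_{\beta} \mathsf{x}(a_4) \to^{*}_{\beta} \mathsf{x}(a_6) \ldots$, where by
Lemma 6 (2) we have $\mathsf{x}(a_{2n}) \to^{+}_{\beta} \mathsf{x}(a_{2n+2})$ if $a_{2n+1} \to_{\mathsf{Beta}} a_{2n+2}$ is not void.
Now, since $\mathsf{x}(a_0) \in \mathcal{SN}_{\beta}$, there is a $j \in \mathbb{N}$ such that for all $i \geq j$,
$a_{2i+1} \to_{\mathsf{Beta}} a_{2i+2}$ is void. In what follows we prove that from some point onwards not only the
Beta reductions are void but also the $\to_{\mathsf{x}}$ reductions. This is done by defining
an interpretation $h$ on $\mathcal{T}_{\lambda\mathsf{x}} \cup \mathcal{L}_{\lambda\mathsf{x}}$:

$h(x\,l) =_{\mathrm{def}} h(l) + 1$
$h(\lambda x.t) =_{\mathrm{def}} h(t) + 1$
$h(t\,l) =_{\mathrm{def}} 2 \times h(t) + h(l) + 1$
$h(t\langle z := v\rangle) =_{\mathrm{def}} h(t) \times (3 \times h(v) + 1)$ if $z \in FV(\mathsf{x}(t))$
$h(t\langle z := v\rangle) =_{\mathrm{def}} h(t) \times 4$ if $z \notin FV(\mathsf{x}(t))$
$h([\,]) =_{\mathrm{def}} 1$
$h(u :: l) =_{\mathrm{def}} h(u) + h(l) + 1$
$h(l\,l') =_{\mathrm{def}} 2 \times h(l) + h(l') + 1$
$h(l\langle z := v\rangle) =_{\mathrm{def}} h(l) \times (3 \times h(v) + 1)$ if $z \in FV(\mathsf{x}(l))$
$h(l\langle z := v\rangle) =_{\mathrm{def}} h(l) \times 4$ if $z \notin FV(\mathsf{x}(l))$

One may then verify that:

- if $a \to_{\lambda\mathsf{x}} b$ is void, then $h(a) = h(b)$, and

- if $a \to_{\mathsf{x}} b$ is not void, then $h(a) > h(b)$.

Thus there must be a $k > j$ such that for all $i \geq k$ we have that not only
$a_{2i+1} \to_{\mathsf{Beta}} a_{2i+2}$ is void but also $a_{2i} \to^{*}_{\mathsf{x}} a_{2i+1}$. □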



Definition 3 (Decent terms). Let $a \in \mathcal{T}_{\lambda\mathsf{x}} \cup \mathcal{L}_{\lambda\mathsf{x}}$. $a$ is decent if for
every substitution $\langle z := v\rangle$ occurring in $a$, $v \in \mathcal{SN}_{\lambda\mathsf{x}}$.

The next proposition follows easily from the fact that void reduction takes
place inside the body of a (void) substitution.

Proposition 6. For decent terms, void reduction is strongly normalizing.

Our aim is now to prove the converse of Corollary 1 when we restrict the
$\lambda\mathsf{x}$-terms in question to decent terms. Before we proceed to the proof we need
one more lemma.

Definition 4. Let $a$ be a pure term with $a \in \mathcal{SN}_{\beta}$. $\mathrm{maxred}_{\beta}(a)$ is defined as
the maximal length of all $\beta$-reduction sequences starting from $a$.

Lemma 8. Let $a, b \in \mathcal{T}_{\lambda\mathsf{x}} \cup \mathcal{L}_{\lambda\mathsf{x}}$ with $\mathsf{x}(a) \in \mathcal{SN}_{\beta}$. If $b$ is a subterm of $a$ and is
not inside the body of a substitution in $a$, then $\mathrm{maxred}_{\beta}(\mathsf{x}(b)) \leq \mathrm{maxred}_{\beta}(\mathsf{x}(a))$.

Proof. By induction on the structure of $a$. We treat the case $a = t\langle z := v\rangle$. If $b$
is a strict subterm of $a$ and is not inside the body of a substitution in $a$, then $b$
is a subterm of $t$. Hence

$\mathrm{maxred}_{\beta}(\mathsf{x}(b)) \leq \mathrm{maxred}_{\beta}(\mathsf{x}(t))$ (by the induction hypothesis)
$\leq \mathrm{maxred}_{\beta}(\mathsf{x}(t)[\mathsf{x}(v)/z])$ (by Lemma 4)
$= \mathrm{maxred}_{\beta}(\mathsf{x}(t\langle z := v\rangle))$ (by Lemma 5 (4))

□



Lemma 9 (Main lemma). If $a$ is decent and $\mathsf{x}(a) \in \mathcal{SN}_{\beta}$, then $a \in \mathcal{SN}_{\lambda\mathsf{x}}$.

Proof. By induction on $\mathrm{maxred}_{\beta}(\mathsf{x}(a))$. Suppose that $a$ is decent and that if $b$
is decent and $\mathrm{maxred}_{\beta}(\mathsf{x}(b)) < \mathrm{maxred}_{\beta}(\mathsf{x}(a))$ then $b \in \mathcal{SN}_{\lambda\mathsf{x}}$. We first show
that if $a \to_{\lambda\mathsf{x}} a'$ then $a'$ is decent. If the reduction takes place inside the body
of a substitution in $a$ then clearly $a'$ is decent. In what follows we show that for
all subterms $b$ of $a$, if the reduction $b \to_{\lambda\mathsf{x}} b'$ takes place outside the body of a
substitution in $a$ then $b'$ is decent. This is proved by induction on the structure
of $b$. We treat some cases.

- $b = t\,l$, $l \to_{\lambda\mathsf{x}} l'$ and $b' = t\,l'$. Then $l'$ is decent by the induction hypothesis.
Therefore $b'$ is decent.

- $b = (t\,l)\langle z := v\rangle$ and $b' = t\langle z := v\rangle\, l\langle z := v\rangle$. Then all bodies of substitutions
in $b'$ are also bodies of substitutions in $b$. Hence $b'$ is decent.

- $b = (\lambda z.t)(u :: l)$ and $b' = t\langle z := u\rangle l$. In this case we crucially need the
first induction hypothesis. All bodies of substitutions in $t$ or $l$ are also
bodies of substitutions in $b$; we need to show that the new body of a
substitution, $u$, is in $\mathcal{SN}_{\lambda\mathsf{x}}$ too. Now since $u$ is a subterm of $b$, $u$ is decent,
and $\mathrm{maxred}_{\beta}(\mathsf{x}(u)) < \mathrm{maxred}_{\beta}((\lambda z.\mathsf{x}(t))(\mathsf{x}(u) :: \mathsf{x}(l))) = \mathrm{maxred}_{\beta}(\mathsf{x}(b)) \leq
\mathrm{maxred}_{\beta}(\mathsf{x}(a))$ by Lemma 8. Therefore, by the first induction hypothesis,
$u \in \mathcal{SN}_{\lambda\mathsf{x}}$.

Hence if $a \to_{\lambda\mathsf{x}} a'$ then $a'$ is decent. Moreover, if $a \to_{\lambda\mathsf{x}} a'$ then by Lemma 6 (1),
$\mathsf{x}(a) \to^{*}_{\beta} \mathsf{x}(a')$ and so $\mathrm{maxred}_{\beta}(\mathsf{x}(a')) \leq \mathrm{maxred}_{\beta}(\mathsf{x}(a))$, where if $\mathrm{maxred}_{\beta}(\mathsf{x}(a'))
< \mathrm{maxred}_{\beta}(\mathsf{x}(a))$ then $a' \in \mathcal{SN}_{\lambda\mathsf{x}}$ by the first induction hypothesis, otherwise
$\mathrm{maxred}_{\beta}(\mathsf{x}(a')) = \mathrm{maxred}_{\beta}(\mathsf{x}(a))$ and we can apply the above argument to $a'$ to
show that if $a' \to_{\lambda\mathsf{x}} a''$ then $a''$ is decent. Thus we see that for any $a'$ such that
$a \to^{*}_{\lambda\mathsf{x}} a'$, $a'$ is decent.

Now suppose that $a$ has an infinite $\lambda\mathsf{x}$-reduction path. By Lemma 7 there
is a term $a'$ on this path such that from $a'$ on all reductions are void. But we
just proved that $a'$ is decent, so we have a contradiction with Proposition 6.
Therefore, $a \in \mathcal{SN}_{\lambda\mathsf{x}}$. □

Corollary 2 (PSN). For any pure term $a \in \mathcal{T}_{\lambda\mathsf{x}} \cup \mathcal{L}_{\lambda\mathsf{x}}$, $a \in \mathcal{SN}_{\lambda\mathsf{x}}$ iff $a \in \mathcal{SN}_{\beta}$.

Proof. The only if part is by Corollary 1. Since pure terms are decent, we have
the if part by Lemma 9. □

7 Strong Normalization 

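In this section we prove that all typable $\lambda\mathsf{x}$-terms are strongly normalizing. For
this we use the reducibility method adapted to explicit substitution calculus and
to Herbelin-style calculus. Here we consider reducibility sets only over $\mathcal{T}_{\lambda\mathsf{x}}$ and
not over $\mathcal{L}_{\lambda\mathsf{x}}$, which is sufficient to prove our main result.

Definition 5. For each type $A$, the set $\mathcal{R}^{A}$ is defined inductively as follows:

$\mathcal{R}^{\varphi} =_{\mathrm{def}} \{t \in \mathcal{T}_{\lambda\mathsf{x}} \mid t \in \mathcal{SN}_{\lambda\mathsf{x}}\}$
$\mathcal{R}^{B \supset C} =_{\mathrm{def}} \{t \in \mathcal{T}_{\lambda\mathsf{x}} \mid \forall v \in \mathcal{R}^{B}\,[t(v :: [\,]) \in \mathcal{R}^{C}]\}$
where $\varphi$ is a type variable.

In the following, we abbreviate a $\lambda\mathsf{x}$-term $t_1 :: (t_2 :: \cdots (t_n :: [\,]) \cdots)$ to
$t_1 :: t_2 :: \cdots :: t_n :: [\,]$, and $(\ldots((t\,l_1)l_2)\cdots)l_n$ to $t\,l_1 l_2 \ldots l_n$.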

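Lemma 10. For every type $A$ and every variable $x$,

1. $\mathcal{R}^{A} \subseteq \mathcal{SN}_{\lambda\mathsf{x}}$.

2. If $x(u_1 :: \cdots :: u_n :: [\,]) \in \mathcal{SN}_{\lambda\mathsf{x}}$ then $(x[\,])(u_1 :: [\,]) \cdots (u_n :: [\,]) \in \mathcal{R}^{A}$.

Proof. By simultaneous induction on the structure of $A$.

i) $A$ is a type variable $\varphi$.

1. By the definition of $\mathcal{R}^{\varphi}$.

2. Let $x(u_1 :: \cdots :: u_n :: [\,]) \in \mathcal{SN}_{\lambda\mathsf{x}}$. Then $(x[\,])(u_1 :: [\,]) \cdots (u_n :: [\,])$
is decent, and $\mathsf{x}(x(u_1 :: \cdots :: u_n :: [\,])) \in \mathcal{SN}_{\beta}$ by Corollary 1. Since
$\mathsf{x}((x[\,])(u_1 :: [\,]) \cdots (u_n :: [\,])) = \mathsf{x}(x(u_1 :: \cdots :: u_n :: [\,]))$, we have
$(x[\,])(u_1 :: [\,]) \cdots (u_n :: [\,]) \in \mathcal{SN}_{\lambda\mathsf{x}}$ by Lemma 9. Hence
$(x[\,])(u_1 :: [\,]) \cdots (u_n :: [\,]) \in \mathcal{R}^{\varphi}$.

ii) $A$ is of the form $B \supset C$.

1. Let $t \in \mathcal{R}^{B \supset C}$. By the induction hypothesis for the second item, $x[\,] \in \mathcal{R}^{B}$
and so $t(x[\,] :: [\,]) \in \mathcal{R}^{C}$. By the induction hypothesis for the first item,
$t(x[\,] :: [\,]) \in \mathcal{SN}_{\lambda\mathsf{x}}$. Hence $t \in \mathcal{SN}_{\lambda\mathsf{x}}$.

2. Let $x(u_1 :: \cdots :: u_n :: [\,]) \in \mathcal{SN}_{\lambda\mathsf{x}}$ and let $v \in \mathcal{R}^{B}$. By the induction
hypothesis for the first item, $v \in \mathcal{SN}_{\lambda\mathsf{x}}$ and so $x(u_1 :: \cdots :: u_n :: v :: [\,]) \in \mathcal{SN}_{\lambda\mathsf{x}}$.
By the induction hypothesis for the second item,
$(x[\,])(u_1 :: [\,]) \cdots (u_n :: [\,])(v :: [\,]) \in \mathcal{R}^{C}$. Hence
$(x[\,])(u_1 :: [\,]) \cdots (u_n :: [\,]) \in \mathcal{R}^{B \supset C}$ if
$x(u_1 :: \cdots :: u_n :: [\,]) \in \mathcal{SN}_{\lambda\mathsf{x}}$. □

Henceforth we use the result of Lemma 10 (1) without reference.

The next two lemmas are essential to our reducibility proof in which the
reducibility sets need to be closed under certain expansion with respect to
$\lambda\mathsf{x}$-reduction.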
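Lemma 11. Let $\mathsf{x}(s) = \mathsf{x}(t)$ and $t \in \mathcal{R}^{A}$. If $s$ is decent (in particular, if every
substitution body in $s$ is a subterm of $t$), then $s \in \mathcal{R}^{A}$.

Proof. By induction on the structure of $A$.

i) $A$ is a type variable $\varphi$. Let $\mathsf{x}(s) = \mathsf{x}(t)$ and $t \in \mathcal{R}^{\varphi}$ ($\subseteq \mathcal{SN}_{\lambda\mathsf{x}}$). Then
$\mathsf{x}(s) = \mathsf{x}(t) \in \mathcal{SN}_{\beta}$ by Corollary 1. If $s$ is decent, then $s \in \mathcal{SN}_{\lambda\mathsf{x}}$ by Lemma 9, and
hence $s \in \mathcal{R}^{\varphi}$.

ii) $A$ is of the form $B \supset C$. We show that for any $v \in \mathcal{R}^{B}$, $s(v :: [\,]) \in \mathcal{R}^{C}$.
Suppose $v \in \mathcal{R}^{B}$ ($\subseteq \mathcal{SN}_{\lambda\mathsf{x}}$). Since $t \in \mathcal{R}^{B \supset C}$ by assumption, we have
$t(v :: [\,]) \in \mathcal{R}^{C}$. By Lemma 5 (3),
$\mathsf{x}(s(v :: [\,])) = \{\mathsf{x}(s)\}\mathsf{x}(v :: [\,]) = \{\mathsf{x}(t)\}\mathsf{x}(v :: [\,]) = \mathsf{x}(t(v :: [\,]))$.
Since $s$ and $v$ are decent, so is $s(v :: [\,])$. Hence by the induction
hypothesis, we have $s(v :: [\,]) \in \mathcal{R}^{C}$. □

Lemma 12. If $(t\langle x := v\rangle)(u_1 :: [\,]) \cdots (u_n :: [\,]) \in \mathcal{R}^{A}$, then
$(\lambda x.t)(v :: [\,])(u_1 :: [\,]) \cdots (u_n :: [\,]) \in \mathcal{R}^{A}$.

Proof. By induction on the structure of $A$.

i) $A$ is a type variable $\varphi$. By Lemma 11, it suffices to show that if
$(t\langle x := v\rangle)(u_1 :: \cdots :: u_n :: [\,]) \in \mathcal{R}^{\varphi}$ then
$(\lambda x.t)(v :: u_1 :: \cdots :: u_n :: [\,]) \in \mathcal{R}^{\varphi}$. Suppose
$(t\langle x := v\rangle)(u_1 :: \cdots :: u_n :: [\,]) \in \mathcal{R}^{\varphi}$ ($\subseteq \mathcal{SN}_{\lambda\mathsf{x}}$). Then
$t, v, u_1, \ldots, u_n \in \mathcal{SN}_{\lambda\mathsf{x}}$. Here we can see that if
$(\lambda x.t)(v :: u_1 :: \cdots :: u_n :: [\,])$ has an infinite
reduction sequence then so does $(t\langle x := v\rangle)(u_1 :: \cdots :: u_n :: [\,])$, contradicting
the hypothesis. Hence $(\lambda x.t)(v :: u_1 :: \cdots :: u_n :: [\,])$ is in $\mathcal{SN}_{\lambda\mathsf{x}}$ and therefore in $\mathcal{R}^{\varphi}$.

ii) $A$ is of the form $B \supset C$. Easily seen by the induction hypothesis for $C$. □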

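Now we prove the reducibility lemma using the results we obtained so far.

Lemma 13.

1. Let $x_1 : A_1, \ldots, x_n : A_n; - \vdash t : B$ and let $u_i \in \mathcal{R}^{A_i}$ for each $1 \leq i \leq n$ where
$x_j \notin FV(u_i)$ for all $1 \leq j \leq n$. Then $t\langle x_1 := u_1\rangle \ldots \langle x_n := u_n\rangle \in \mathcal{R}^{B}$.

2. Let $x_1 : A_1, \ldots, x_n : A_n; B \vdash l : C$ and let $u_i \in \mathcal{R}^{A_i}$ for each $1 \leq i \leq n$
where $x_j \notin FV(u_i)$ for all $1 \leq j \leq n$. Then for any $t \in \mathcal{R}^{B}$,
$t(l\langle x_1 := u_1\rangle \ldots \langle x_n := u_n\rangle) \in \mathcal{R}^{C}$.

Proof. Both items are proved simultaneously by induction on the structure of
derivations. We treat some cases. Let $\Gamma$ denote $x_1 : A_1, \ldots, x_n : A_n$.

i) The derivation is an instance of Ax with conclusion $\Gamma; A \vdash [\,] : A$.

We show that for any $t \in \mathcal{R}^{A}$, $t([\,]\langle x_1 := u_1\rangle \ldots \langle x_n := u_n\rangle) \in \mathcal{R}^{A}$. Suppose
$t \in \mathcal{R}^{A}$ ($\subseteq \mathcal{SN}_{\lambda\mathsf{x}}$). By assumption, we have $u_i \in \mathcal{R}^{A_i}$ ($\subseteq \mathcal{SN}_{\lambda\mathsf{x}}$) for each
$1 \leq i \leq n$. Hence $t([\,]\langle x_1 := u_1\rangle \ldots \langle x_n := u_n\rangle)$ is decent. Now we have
$\mathsf{x}(t([\,]\langle x_1 := u_1\rangle \ldots \langle x_n := u_n\rangle)) = \mathsf{x}(t[\,]) = \{\mathsf{x}(t)\}[\,] = \mathsf{x}(t)$ using Lemma 5 (3)
and Lemma 1. Hence $t([\,]\langle x_1 := u_1\rangle \ldots \langle x_n := u_n\rangle) \in \mathcal{R}^{A}$ by Lemma 11.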
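ii) The last rule derives $\Gamma, x : A; - \vdash x\,l : B$ from the premise $\Gamma, x : A; A \vdash l : B$.

We show that

$(x\,l)\langle x_1 := u_1\rangle \ldots \langle x_i := u_i\rangle\langle x := u\rangle\langle x_{i+1} := u_{i+1}\rangle \ldots \langle x_n := u_n\rangle \in \mathcal{R}^{B}$

where $u \in \mathcal{R}^{A}$ and $x_j \notin FV(u)$ for all $1 \leq j \leq n$. By the induction hypothesis,

$u(l\langle x_1 := u_1\rangle \ldots \langle x_i := u_i\rangle\langle x := u\rangle\langle x_{i+1} := u_{i+1}\rangle \ldots \langle x_n := u_n\rangle) \in \mathcal{R}^{B}$.

Now,

$(x\,l)\langle x_1 := u_1\rangle \ldots \langle x_i := u_i\rangle\langle x := u\rangle\langle x_{i+1} := u_{i+1}\rangle \ldots \langle x_n := u_n\rangle$
$\to^{*}_{\mathsf{x}} (x\,l\langle x_1 := u_1\rangle \ldots \langle x_i := u_i\rangle)\langle x := u\rangle\langle x_{i+1} := u_{i+1}\rangle \ldots \langle x_n := u_n\rangle$
$\to^{*}_{\mathsf{x}} (u\,l\langle x_1 := u_1\rangle \ldots \langle x_i := u_i\rangle\langle x := u\rangle)\langle x_{i+1} := u_{i+1}\rangle \ldots \langle x_n := u_n\rangle$
$\to^{*}_{\mathsf{x}} u\langle x_{i+1} := u_{i+1}\rangle \ldots \langle x_n := u_n\rangle\, l\langle x_1 := u_1\rangle \ldots \langle x := u\rangle \ldots \langle x_n := u_n\rangle$

and hence

$\mathsf{x}((x\,l)\langle x_1 := u_1\rangle \ldots \langle x_i := u_i\rangle\langle x := u\rangle\langle x_{i+1} := u_{i+1}\rangle \ldots \langle x_n := u_n\rangle)$
$= \mathsf{x}(u\langle x_{i+1} := u_{i+1}\rangle \ldots \langle x_n := u_n\rangle\, l\langle x_1 := u_1\rangle \ldots \langle x := u\rangle \ldots \langle x_n := u_n\rangle)$
$= \{\mathsf{x}(u)[\mathsf{x}(u_{i+1})/x_{i+1}] \ldots [\mathsf{x}(u_n)/x_n]\}\mathsf{x}(l)[\mathsf{x}(u_1)/x_1] \ldots [\mathsf{x}(u)/x] \ldots [\mathsf{x}(u_n)/x_n]$ (by Lemma 5)
$= \{\mathsf{x}(u)\}\mathsf{x}(l)[\mathsf{x}(u_1)/x_1] \ldots [\mathsf{x}(u)/x] \ldots [\mathsf{x}(u_n)/x_n]$ (by Lemma 2)
$= \mathsf{x}(u(l\langle x_1 := u_1\rangle \ldots \langle x := u\rangle \ldots \langle x_n := u_n\rangle))$. (by Lemma 5)

Therefore, by Lemma 11, we have

$(x\,l)\langle x_1 := u_1\rangle \ldots \langle x_i := u_i\rangle\langle x := u\rangle\langle x_{i+1} := u_{i+1}\rangle \ldots \langle x_n := u_n\rangle \in \mathcal{R}^{B}$.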



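iii) The last rule is $L{\supset}$, deriving $\Gamma; A \supset B \vdash v :: l : C$ from the premises
$\Gamma; - \vdash v : A$ and $\Gamma; B \vdash l : C$.

We show that for any $t \in \mathcal{R}^{A \supset B}$, $t((v :: l)\langle x_1 := u_1\rangle \ldots \langle x_n := u_n\rangle) \in \mathcal{R}^{C}$.
Suppose $t \in \mathcal{R}^{A \supset B}$. By the induction hypothesis for the left premise,
$v\langle x_1 := u_1\rangle \ldots \langle x_n := u_n\rangle \in \mathcal{R}^{A}$, and so
$t(v\langle x_1 := u_1\rangle \ldots \langle x_n := u_n\rangle :: [\,]) \in \mathcal{R}^{B}$.
Then by the induction hypothesis for the right premise,

$(t(v\langle x_1 := u_1\rangle \ldots \langle x_n := u_n\rangle :: [\,]))(l\langle x_1 := u_1\rangle \ldots \langle x_n := u_n\rangle) \in \mathcal{R}^{C}$.

Since $\mathsf{x}(t((v :: l)\langle x_1 := u_1\rangle \ldots \langle x_n := u_n\rangle)) =
\mathsf{x}((t(v\langle x_1 := u_1\rangle \ldots \langle x_n := u_n\rangle :: [\,]))(l\langle x_1 := u_1\rangle \ldots \langle x_n := u_n\rangle))$,
we have $t((v :: l)\langle x_1 := u_1\rangle \ldots \langle x_n := u_n\rangle) \in \mathcal{R}^{C}$ by applying Lemma 11.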
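iv) The last rule derives $\Gamma; - \vdash \lambda x.t : A \supset B$ from the premise $\Gamma, x : A; - \vdash t : B$.

We show that $(\lambda x.t)\langle x_1 := u_1\rangle \ldots \langle x_n := u_n\rangle \in \mathcal{R}^{A \supset B}$. Suppose $v \in \mathcal{R}^{A}$.
Then by the induction hypothesis, $t\langle x_1 := u_1\rangle \ldots \langle x_n := u_n\rangle\langle x := v\rangle \in \mathcal{R}^{B}$.
By Lemma 12, $(\lambda x.t\langle x_1 := u_1\rangle \ldots \langle x_n := u_n\rangle)(v :: [\,]) \in \mathcal{R}^{B}$. Hence
$\lambda x.t\langle x_1 := u_1\rangle \ldots \langle x_n := u_n\rangle \in \mathcal{R}^{A \supset B}$, and by applying Lemma 11 we have
$(\lambda x.t)\langle x_1 := u_1\rangle \ldots \langle x_n := u_n\rangle \in \mathcal{R}^{A \supset B}$.

v) The last rule is Cut2, deriving $\Gamma; B \vdash l\langle x := v\rangle : C$ from the premises
$\Gamma; - \vdash v : A$ and $\Gamma, x : A; B \vdash l : C$.

We show that for any $t \in \mathcal{R}^{B}$, $t(l\langle x := v\rangle\langle x_1 := u_1\rangle \ldots \langle x_n := u_n\rangle) \in \mathcal{R}^{C}$.
Suppose $t \in \mathcal{R}^{B}$. By the induction hypothesis for the left premise,
$v\langle x_1 := u_1\rangle \ldots \langle x_n := u_n\rangle \in \mathcal{R}^{A}$. Then by the induction hypothesis for the right
premise, $t(l\langle x_1 := u_1\rangle \ldots \langle x_n := u_n\rangle\langle x := v\langle x_1 := u_1\rangle \ldots \langle x_n := u_n\rangle\rangle) \in \mathcal{R}^{C}$.
Using Lemma 5 and the substitution lemma (Lemma 3) we have
$\mathsf{x}(t(l\langle x := v\rangle\langle x_1 := u_1\rangle \ldots \langle x_n := u_n\rangle)) =
\mathsf{x}(t(l\langle x_1 := u_1\rangle \ldots \langle x_n := u_n\rangle\langle x := v\langle x_1 := u_1\rangle \ldots \langle x_n := u_n\rangle\rangle))$,
and hence $t(l\langle x := v\rangle\langle x_1 := u_1\rangle \ldots \langle x_n := u_n\rangle) \in \mathcal{R}^{C}$
by Lemma 11. □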



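Theorem 3.

1. Let $\Gamma; - \vdash t : B$ for some $\Gamma$ and $B$. Then $t \in \mathcal{SN}_{\lambda\mathsf{x}}$.
2. Let $\Gamma; B \vdash l : C$ for some $\Gamma$, $B$ and $C$. Then $l \in \mathcal{SN}_{\lambda\mathsf{x}}$.

Proof. 1. Suppose $\Gamma$ is $\{x_1 : A_1, \ldots, x_n : A_n\}$. By Lemma 13 (1),
$t\langle x_1 := y_1[\,]\rangle \ldots \langle x_n := y_n[\,]\rangle \in \mathcal{R}^{B}$ ($\subseteq \mathcal{SN}_{\lambda\mathsf{x}}$) where each $y_i$ is fresh and $y_i[\,] \in \mathcal{R}^{A_i}$
by Lemma 10 (2). Hence $t \in \mathcal{SN}_{\lambda\mathsf{x}}$.

2. Similarly, $(z[\,])(l\langle x_1 := y_1[\,]\rangle \ldots \langle x_n := y_n[\,]\rangle) \in \mathcal{R}^{C}$ ($\subseteq \mathcal{SN}_{\lambda\mathsf{x}}$) where $z$ is also
fresh. Hence $l \in \mathcal{SN}_{\lambda\mathsf{x}}$. □

The above theorem shows that the cut-elimination procedure defined by the
reduction rules of $\lambda\mathsf{x}$-calculus is strongly normalizing. Also, by Corollary 1, we
have strong normalization for typable pure terms with respect to $\beta$-reduction,
and thus strong normalization for typable terms in the usual $\lambda$-calculus as well.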

8 Conclusion 

In this paper we presented a direct proof of strong normalization for the
typable terms of the extended Herbelin's calculus. The main lemma was useful for
both the inductive proof of PSN with respect to $\beta$-reduction on pure terms and
the reducibility proof of strong normalization for typable terms. Our
reducibility method seems to be helpful in investigating other reduction properties and
semantical aspects of the calculus.

In the literature [13,16] there are reducibility methods for other calculi with
explicit substitutions. The relationship between them and ours will be
investigated in future work.

Since the extended Herbelin's calculus clarifies how to simulate $\beta$-reduction
by cut-elimination, it can be viewed as a basis for understanding computational
meanings of various cut-elimination procedures. It would also be interesting to
use the calculus for studies of integrating proof search and proof normalization.
Abstract. We show that the set-theoretic semantics of $\lambda^{\to 2}$ (simply
typed lambda calculus with a boolean type but no type variables) is
complete by inverting evaluation using decision trees. This leads to an
implementation of normalization by evaluation which is witnessed by the
source of part of this paper being a literate Haskell script. We show the
correctness of our implementation using logical relations.



1 Introduction 

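Which is the simplest typed $\lambda$-calculus without uninterpreted base types or type
variables? We suggest that the answer should be $\lambda^{\to 2}$: simply typed lambda
calculus extended by a type of booleans Bool with True, False : Bool and
If $t\ u_0\ u_1 : \sigma$, given $t$ : Bool and $u_0, u_1 : \sigma$. The equational theory is given by
the usual $\beta\eta$-equations of $\lambda^{\to}$ and the following equations concerning Bool:

$\mathrm{If}\ \mathrm{True}\ u_0\ u_1 =_{\beta\eta} u_0$
$\mathrm{If}\ \mathrm{False}\ u_0\ u_1 =_{\beta\eta} u_1$
$\mathrm{If}\ t\ \mathrm{True}\ \mathrm{False} =_{\beta\eta} t$
$v\ (\mathrm{If}\ t\ u_0\ u_1) =_{\beta\eta} \mathrm{If}\ t\ (v\ u_0)\ (v\ u_1)$

The equations are motivated by the categorical interpretation of Bool as a
boolean object, i.e., an object Bool such that
$\mathrm{Hom}(\Gamma \times \mathrm{Bool}, \sigma) \simeq \mathrm{Hom}(\Gamma, \sigma) \times \mathrm{Hom}(\Gamma, \sigma)$
(naturally in $\Gamma$ and $\sigma$). The calculus can thus be interpreted in any
cartesian closed category with Bool (using the cartesian structure to interpret
contexts).

The equational theory introduces some interesting equalities. E.g., consider

$\mathit{once} = \lambda f^{\mathrm{Bool} \to \mathrm{Bool}}\, \lambda x^{\mathrm{Bool}}\, f\ x$
$\mathit{thrice} = \lambda f^{\mathrm{Bool} \to \mathrm{Bool}}\, \lambda x^{\mathrm{Bool}}\, f\ (f\ (f\ x))$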

* partially supported by EU Framework 5 thematic networks TYPES and APPSEM 
II and the Estonian IT Foundation 

** partially supported by the Estonian Science Foundation under grant No. 5567 
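We observe that $\mathit{once} =_{\beta\eta} \mathit{thrice}$. To see this, we note that, given
$f : \mathrm{Bool} \to \mathrm{Bool}$, we have

$f\ (f\ (f\ \mathrm{True})) =_{\beta\eta} \mathrm{If}\ (f\ \mathrm{True})\ (f\ (f\ \mathrm{True}))\ (f\ (f\ \mathrm{False}))$
$=_{\beta\eta} \mathrm{If}\ (f\ \mathrm{True})\ \mathrm{True}\ (f\ (f\ \mathrm{False}))$
$=_{\beta\eta} \mathrm{If}\ (f\ \mathrm{True})\ \mathrm{True}\ (\mathrm{If}\ (f\ \mathrm{False})\ (f\ \mathrm{True})\ (f\ \mathrm{False}))$
$=_{\beta\eta} \mathrm{If}\ (f\ \mathrm{True})\ \mathrm{True}\ (\mathrm{If}\ (f\ \mathrm{False})\ \mathrm{False}\ \mathrm{False})$
$=_{\beta\eta} \mathrm{If}\ (f\ \mathrm{True})\ \mathrm{True}\ \mathrm{False}$
$=_{\beta\eta} f\ \mathrm{True}$

Symmetrically, we can show that $f\ (f\ (f\ \mathrm{False})) =_{\beta\eta} f\ \mathrm{False}$, and hence

$\mathit{thrice} = \lambda f^{\mathrm{Bool} \to \mathrm{Bool}}\, \lambda x^{\mathrm{Bool}}\, f\ (f\ (f\ x))$
$=_{\beta\eta} \lambda f^{\mathrm{Bool} \to \mathrm{Bool}}\, \lambda x^{\mathrm{Bool}}\, \mathrm{If}\ x\ (f\ (f\ (f\ \mathrm{True})))\ (f\ (f\ (f\ \mathrm{False})))$
$=_{\beta\eta} \lambda f^{\mathrm{Bool} \to \mathrm{Bool}}\, \lambda x^{\mathrm{Bool}}\, \mathrm{If}\ x\ (f\ \mathrm{True})\ (f\ \mathrm{False})$
$=_{\beta\eta} \lambda f^{\mathrm{Bool} \to \mathrm{Bool}}\, \lambda x^{\mathrm{Bool}}\, f\ x$
$= \mathit{once}$

It is easy to see that once and thrice are equal in the standard semantics
where Bool is interpreted by a two-element set Bool = {true, false} and function
types are set-theoretic function spaces. We observe that there are only four
elements in $\mathrm{Bool} \to \mathrm{Bool} = \{x \mapsto \mathrm{true},\ x \mapsto x,\ x \mapsto \neg x,\ x \mapsto \mathrm{false}\}$ and that for
all the four $f \in \mathrm{Bool} \to \mathrm{Bool}$ we have $f^3 = f$.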
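The set-theoretic claim is small enough to check exhaustively. The sketch below enumerates the four functions on a two-element set as lookup tables and compares once and thrice on each; the encoding of functions as dictionaries is an assumption made for this illustration and is unrelated to the paper's Haskell script.

```python
from itertools import product

BOOL = (True, False)

# Each function Bool -> Bool is a table {True: f(True), False: f(False)}.
functions = [dict(zip(BOOL, outputs)) for outputs in product(BOOL, repeat=2)]


def once(f):
    """The set-theoretic meaning of once: apply f a single time."""
    return lambda x: f[x]


def thrice(f):
    """The set-theoretic meaning of thrice: apply f three times."""
    return lambda x: f[f[f[x]]]


agree = all(once(f)(x) == thrice(f)(x) for f in functions for x in BOOL)
print(len(functions), agree)
```

The check covers the two constant functions, the identity and negation, which are exactly the four elements listed in the text.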

May we use set-theoretic reasoning to prove equalities up to $\beta\eta$-convertibility?
The answer is yes for $\lambda^{\to 2}$, because for $\lambda^{\to 2}$ we can invert set-theoretic evaluation
of typed closed terms. That is: we can define a function
$\mathrm{quote}^{\sigma} \in [\![\sigma]\!]_{\mathrm{set}} \to \mathrm{Tm}\ \sigma$
such that $t =_{\beta\eta} \mathrm{quote}^{\sigma}\ [\![t]\!]_{\mathrm{set}}$, for any $t \in \mathrm{Tm}\ \sigma$. Consequently, we get that, for
any $t, t' \in \mathrm{Tm}\ \sigma$, $t =_{\beta\eta} t' \iff [\![t]\!]_{\mathrm{set}} = [\![t']\!]_{\mathrm{set}}$.

The existence of quote also implies that $=_{\beta\eta}$ is maximally consistent, i.e.,
identifying any two non-$\beta\eta$-convertible closed terms would lead to an inconsistent
theory. This provides another justification for the specific choice of $=_{\beta\eta}$.

We do not analyze the normal forms, i.e. the codomain of nf and quote, here.
However, the construction presented here, which is based on decision trees, leads
to simple normal forms and we conjecture that this is the same set as the set of
normal forms presented in [1,5] in the case Bool = 1 + 1.



Haskell as a Poor Man’s Type Theory 

Our construction is entirely constructive, so it can be carried out, e.g. in
Martin-Löf's Type Theory, and we obtain an implementation of normalization
$\mathrm{nf}^{\sigma}\ t = \mathrm{quote}^{\sigma}\ [\![t]\!]_{\mathrm{set}}$. We shall here use the functional language Haskell as a poor man's
Type Theory and obtain a Haskell program to normalize terms.

Haskell hasn't got dependent types (in particular inductive families), hence
the Haskell types we are using are only approximations of their type-theoretic
correspondents. E.g., in Type Theory, we can introduce a type $\mathrm{Tm}_{\Gamma}\ \sigma$ for the
terms of type $\sigma$ in context $\Gamma$, but in Haskell all such types get approximated by a
type Tm that contains all untyped terms. Similarly, in Type Theory, we can have
a type $[\![\sigma]\!]_{\mathrm{set}}$ for the set-theoretic denotation of each object-language type $\sigma$, so
that, e.g., $[\![\sigma \to \tau]\!]_{\mathrm{set}} = [\![\sigma]\!]_{\mathrm{set}} \to [\![\tau]\!]_{\mathrm{set}}$, but in the Haskell implementation we
have to content ourselves with a mixed-variant recursive type El with a constructor
SLam $\in$ Ty $\to$ (El $\to$ El) $\to$ El.

We believe that this informal use of Type Theory is an effective way to 
arrive at functional programs which are correct by construction. However, we 
hope that in the future we can go further and bridge the gap between informal 
type theoretic reasoning and the actual implementation by using a dependently 
typed programming language such as the Epigram system, which is currently 
being developed by Conor McBride [12]. 

Related Work 

Inverting evaluation to achieve normalization by evaluation (NBE, aka. reduction- 
free normalization) was pioneered in [6] for simply typed lambda calculus with 
type variables and a non-standard semantics; a categorical account in terms of 
presheaves was given in [2]; this was extended to System F in [3,4]; see [9] for 
a recent survey on NBE. The completeness of the set-theoretic model in the 
presence of coproducts has been shown in [8] and our case arises as a special 
case when there are no type variables. Normalization procedures for typed
λ-calculus with coproducts can be found in [10,11] using rewriting techniques and
[1,5] using NBE and sheaf theory. Both approaches allow type variables but do
not handle the empty type. Here we present a much simpler construction for
closed types using the simplest possible semantics of first-order simply typed
λ-calculi (the set-theoretic one) and also provide a concrete implementation of
quote and nf whose correctness we show in detail. 

2 Implementation of λ→2

The source of Sections 2 and 3 of this paper is a literate Haskell script
implementing normalization for λ→2 and is available from

http://www.cs.nott.ac.uk/~txa/publ/Nbe2.lhs

We start by introducing types Ty ∈ ★, variables Var ∈ ★, typing contexts
Con ∈ ★ and untyped terms Tm ∈ ★ of the object language by the following
Haskell datatype definitions:

data Ty = Bool | Ty :-> Ty
    deriving (Show, Eq)

type Var = String

type Con = [ (Var, Ty) ]

data Tm = Var Var
    | TTrue | TFalse | If Tm Tm Tm
    | Lam Ty String Tm | App Tm Tm
    deriving (Show, Eq)

We view these recursive definitions as inductive definitions, i.e., we do not con- 
sider infinite terms. All the functions we define are total wrt. their precise type- 
theoretic types. 

Implementing typed terms Tm ∈ Con → Ty → ★ would take inductive families,
which we cannot use in Haskell. But we can implement type inference
infer ∈ Con → Tm → Maybe Ty (where Maybe X = 1 + X as usual):

infer :: Con -> Tm -> Maybe Ty
infer gamma (Var x) =
    do sigma <- lookup x gamma
       Just sigma
infer gamma TTrue = Just Bool
infer gamma TFalse = Just Bool
infer gamma (If t u0 u1) =
    do Bool <- infer gamma t
       sigma0 <- infer gamma u0
       sigma1 <- infer gamma u1
       if sigma0 == sigma1 then Just sigma0 else Nothing
infer gamma (Lam sigma x t) =
    do tau <- infer ((x, sigma) : gamma) t
       Just (sigma :-> tau)
infer gamma (App t u) =
    do (sigma :-> tau) <- infer gamma t
       sigma' <- infer gamma u
       if sigma == sigma' then Just tau else Nothing

This implementation is correct in the sense that t ∈ Tm_Γ σ iff infer Γ t = Just σ.
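
To illustrate how the type inference function behaves on well-typed and ill-typed input, here is a small self-contained sketch. It repeats the definitions from the text so that it compiles on its own; the example terms notTm and badApp are illustrative names that do not occur in the paper.

```haskell
data Ty = Bool | Ty :-> Ty
  deriving (Show, Eq)

type Var = String
type Con = [(Var, Ty)]

data Tm = Var Var
        | TTrue | TFalse | If Tm Tm Tm
        | Lam Ty String Tm | App Tm Tm
  deriving (Show, Eq)

-- Type inference in the Maybe monad; a failed pattern match in a
-- do-block (e.g. a condition that is not of type Bool) yields Nothing.
infer :: Con -> Tm -> Maybe Ty
infer gamma (Var x) = lookup x gamma
infer _ TTrue  = Just Bool
infer _ TFalse = Just Bool
infer gamma (If t u0 u1) = do
  Bool   <- infer gamma t
  sigma0 <- infer gamma u0
  sigma1 <- infer gamma u1
  if sigma0 == sigma1 then Just sigma0 else Nothing
infer gamma (Lam sigma x t) = do
  tau <- infer ((x, sigma) : gamma) t
  Just (sigma :-> tau)
infer gamma (App t u) = do
  (sigma :-> tau) <- infer gamma t
  sigma' <- infer gamma u
  if sigma == sigma' then Just tau else Nothing

-- Boolean negation as an object-language term (illustrative).
notTm :: Tm
notTm = Lam Bool "x" (If (Var "x") TFalse TTrue)

-- An ill-typed term: a boolean applied as if it were a function.
badApp :: Tm
badApp = App TTrue TFalse
```

With these definitions, infer [] notTm returns Just (Bool :-> Bool), whereas infer [] badApp and infer [] (Var "y") both return Nothing.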

Evaluation of types ⟦−⟧ ∈ Ty → ★ is again an inductive family, which we
cannot implement in Haskell, and the workaround is to have all ⟦σ⟧ coalesced
into one metalanguage type el (of untyped elements) much the same way as all
Tm_Γ σ appear coalesced in Tm. We use a type class Sem to state what we require
of such a coalesced type el:

class Sem el where
    true :: el
    false :: el
    xif :: el -> el -> el -> el
    lam :: Ty -> (el -> el) -> el
    app :: el -> el -> el

Evaluation of types ⟦−⟧ ∈ Ty → ★ naturally induces evaluation of contexts
⟦−⟧ ∈ Con → ★ (taking a context Γ to the type of environments for Γ), defined
by

                        ρ ∈ ⟦Γ⟧     d ∈ ⟦σ⟧
    ------------      ----------------------------
     [] ∈ ⟦[]⟧         (x, d) : ρ ∈ ⟦(x, σ) : Γ⟧

In the Haskell code we approximate evaluation of contexts by a type Env el
(of untyped environments):

type Env el = [ (Var, el) ]

Given t ∈ Tm_Γ σ we define the evaluation of terms ⟦t⟧ ∈ ⟦Γ⟧ → ⟦σ⟧. In Haskell
this is implemented as eval:

eval :: Sem el => Env el -> Tm -> el
eval rho (Var x) = d
    where (Just d) = lookup x rho
eval rho TTrue = true
eval rho TFalse = false
eval rho (If t u0 u1) =
    xif (eval rho t) (eval rho u0) (eval rho u1)
eval rho (Lam sigma x t) =
    lam sigma (\ d -> eval ((x, d) : rho) t)
eval rho (App t u) = app (eval rho t) (eval rho u)

The standard set-theoretic semantics is given by

    ⟦Bool⟧set = Bool
    ⟦σ → τ⟧set = ⟦σ⟧set → ⟦τ⟧set

This can be represented in Haskell as an instance of Sem:

data El = STrue | SFalse | SLam Ty (El -> El)

instance Sem El where
    true = STrue
    false = SFalse
    xif STrue d _ = d
    xif SFalse _ d = d
    lam = SLam
    app (SLam _ f) d = f d

Since sets form a cartesian closed category with a boolean object, the
set-theoretic semantics validates all βη-equalities. This is to say that ⟦−⟧set is
equationally sound:

Proposition 2.1 (Soundness). If ρ ∈ ⟦Γ⟧ and t =βη t' ∈ Tm_Γ σ, then
⟦t⟧set ρ = ⟦t'⟧set ρ.

Since all the sets we consider are finite, semantic equality can be implemented
in Haskell, by making use of the function enum ∈ (σ ∈ Ty) → Tree ⟦σ⟧, which
we will provide later:

instance Eq El where
    STrue == STrue = True
    SFalse == SFalse = True
    (SLam sigma f) == (SLam _ f') =
        and [f d == f' d | d <- flatten (enum sigma)]
    _ == _ = False

Using the same function we can also print elements of El:

instance Show El where
    show STrue = "STrue"
    show SFalse = "SFalse"
    show (SLam sigma f) =
        "SLam " ++ (show sigma) ++ " " ++
        (show [ (d, f d) | d <- flatten (enum sigma) ])

The equational theory of the calculus itself gives rise to another semantics:
the free semantics, or typed terms up to βη-convertibility. This can be
approximated by the following Haskell code, where a redundancy-avoiding version if'
of If is used which produces a shorter but βη-equivalent term:

if' :: Tm -> Tm -> Tm -> Tm
if' t TTrue TFalse = t
if' t u0 u1 = if u0 == u1 then u0 else If t u0 u1

instance Sem Tm where
    true = TTrue
    false = TFalse
    xif = if'
    lam sigma f = Lam sigma "x" (f (Var "x"))
    app = App

We also observe that the use of a fixed variable is justified by the fact that our
algorithm uses at most one bound variable at a time. A correct dependently
typed version of the free semantics requires the use of presheaves to ensure that
the argument to Lam is stable under renaming. We refrain from presenting the
details here. It is well known that this semantics is equationally sound.

3 Implementation of quote 

We now proceed to implementing quote ∈ (σ ∈ Ty) → ⟦σ⟧set → Tm σ.

To define quote^{σ→τ} we use enum^σ, which generates a decision tree whose
leaves are all the elements of ⟦σ⟧, and questions^σ, which generates a list of
questions, i.e., elements of ⟦σ⟧ → ⟦Bool⟧, based on the answers to which an
element of ⟦σ⟧ can be looked up in the tree enum^σ. (Since our decision trees are
perfectly balanced and we use the same list of questions along each branch of a
tree, we can separate this list of questions from the tree.)

Decision trees Tree ∈ Ty → ★ are provided by

data Tree a = Val a | Choice (Tree a) (Tree a) deriving (Show, Eq)

We will exploit the fact that Tree is a monad

instance Monad Tree where
    return = Val
    (Val a) >>= h = h a
    (Choice l r) >>= h = Choice (l >>= h) (r >>= h)

(return and >>= are Haskell for the unit resp. the bind or Kleisli extension
operation of a monad) and hence a functor

instance Functor Tree where
    fmap h ds = ds >>= return . h

(fmap is Haskell for the action of a functor on morphisms).

It is convenient to use the function flatten which calculates the list of leaves
of a given tree:

flatten :: Tree a -> [ a ]
flatten (Val a) = [ a ]
flatten (Choice l r) = (flatten l) ++ (flatten r)
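
The instance declarations above are written in Haskell 98 style. Current GHC versions additionally require an Applicative instance for every Monad, so a version that compiles today might look as follows. This is a sketch under that assumption; the example value pairs is illustrative and not taken from the paper.

```haskell
data Tree a = Val a | Choice (Tree a) (Tree a)
  deriving (Show, Eq)

instance Functor Tree where
  fmap h (Val a)      = Val (h a)
  fmap h (Choice l r) = Choice (fmap h l) (fmap h r)

-- Required by modern GHC; defined via the monadic bind below.
instance Applicative Tree where
  pure = Val
  tf <*> ta = tf >>= \h -> fmap h ta

-- Bind grafts a new tree onto every leaf of the old one.
instance Monad Tree where
  Val a      >>= h = h a
  Choice l r >>= h = Choice (l >>= h) (r >>= h)

-- The list of leaves, from left to right.
flatten :: Tree a -> [a]
flatten (Val a)      = [a]
flatten (Choice l r) = flatten l ++ flatten r

-- Illustrative: all pairs of booleans as a depth-two decision tree.
pairs :: Tree (Bool, Bool)
pairs = do
  x <- Choice (Val True) (Val False)
  y <- Choice (Val True) (Val False)
  return (x, y)
```

Flattening pairs lists the four leaves in left-to-right order, starting with (True, True) and ending with (False, False).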

We implement enum^σ and questions^σ by mutual induction on σ ∈ Ty. The
precise typings of the functions are enum ∈ (σ ∈ Ty) → Tree ⟦σ⟧ and questions ∈
(σ ∈ Ty) → [⟦σ⟧ → ⟦Bool⟧]. As usual, Haskell cannot express those subtleties
due to its lack of dependent types, but we can declare

enum :: Sem el => Ty -> Tree el
questions :: Sem el => Ty -> [ el -> el ]

The base case is straightforward: A boolean is true or false and to know
which one it is it suffices to know it.

enum Bool = Choice (Val true) (Val false)

questions Bool = [ \ b -> b ]

The implementation of enum^{σ→τ} and questions^{σ→τ} proceeds from the idea
that a function is determined by its graph: to know a function it suffices to know
its value on all possible argument values. The main idea in the implementation
of enum^{σ→τ} is therefore to start with enum^τ and to duplicate the tree for each
question in questions^σ using the bind of Tree:

enum (sigma :-> tau) =
    fmap (lam sigma) (mkEnum (questions sigma) (enum tau))

mkEnum :: Sem el => [ el -> el ] -> Tree el -> Tree (el -> el)
mkEnum [] es = fmap (\ e -> \ d -> e) es
mkEnum (q : qs) es = (mkEnum qs es) >>= \ f1 ->
                     (mkEnum qs es) >>= \ f2 ->
                     return (\ d -> xif (q d) (f1 d) (f2 d))

questions^{σ→τ} produces the appropriate questions by enumerating σ and using
questions from τ:

questions (sigma :-> tau) =
    [ \ f -> q (app f d) | d <- flatten (enum sigma),
                           q <- questions tau ]

As an example, the enumeration and questions for Bool → Bool return:

Choice
    (Choice
        (Val (lam Bool (\ d -> xif d true true)))
        (Val (lam Bool (\ d -> xif d true false))))
    (Choice
        (Val (lam Bool (\ d -> xif d false true)))
        (Val (lam Bool (\ d -> xif d false false))))

resp.

(\ f -> app f true) :
(\ f -> app f false) :
[]

We can look up an element in the decision tree for a type by answering all
the questions; this is realized by the function find below. To define the domain
of find precisely we define a relation between lists of answers and decision trees
o ⊆ [A] × Tree B inductively:

      t ∈ B            a ∈ A    as o l    as o r
    ------------      ----------------------------
     [] o Val t           a : as o Choice l r

Now given as ∈ [⟦Bool⟧], ts ∈ Tree ⟦σ⟧ such that as o ts we obtain find as ts ∈
⟦σ⟧, implemented in Haskell:

find :: Sem el => [ el ] -> Tree el -> el
find [] (Val t) = t
find (a : as) (Choice l r) = xif a (find as l) (find as r)

We are now ready to implement quote^σ ∈ ⟦σ⟧set → Tm σ, with Haskell typing

quote :: Ty -> El -> Tm

by induction on σ ∈ Ty. As usual, the base case is easy:

quote Bool STrue = TTrue
quote Bool SFalse = TFalse

quote^{σ→τ} is more interesting: Our strategy is to map quote^τ ∘ f to the
set-theoretic enum^σ and then to build a tree of If expressions by using the
syntactic questions^σ in conjunction with the syntactic find:

quote (sigma :-> tau) (SLam _ f) =
    lam sigma (\ t -> find [ q t | q <- questions sigma ]
                           (fmap (quote tau . f) (enum sigma)))

(Notice that in Haskell it is inferred automatically which semantics is meant
where.)

As already discussed in the introduction, we implement normalization nf ∈
(σ ∈ Ty) → Tm σ → Tm σ by

nf :: Ty -> Tm -> Tm
nf sigma t = quote sigma (eval [] t)

Since we can infer types, we can implement nf' ∈ Tm → Maybe (Σσ∈Ty. Tm σ):

nf' :: Tm -> Maybe (Ty, Tm)
nf' t = do sigma <- infer [] t
           Just (sigma, nf sigma t)

We test our implementation with the example from the introduction:

b2b = Bool :-> Bool

once = Lam b2b "f" (Lam Bool "x" (App (Var "f") (Var "x")))

twice = Lam b2b "f" (Lam Bool "x" (App (Var "f")
                                       (App (Var "f") (Var "x"))))

thrice = Lam b2b "f"
             (Lam Bool "x" (App (Var "f")
                                (App (Var "f")
                                     (App (Var "f") (Var "x")))))

and convince ourselves that (nf' once == nf' thrice) = true but (nf' once ==
nf' twice) = false. Since semantic equality is decidable we do not actually have
to construct the normal forms to decide convertibility.
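
For readers who want to run this test, the fragments of Sections 2 and 3 can be assembled into one file roughly as follows. This is a sketch rather than the authors' script: it adds the Applicative instance that current GHC requires, replaces partial pattern matches by explicit error calls, and omits infer, the Eq and Show instances for El, and nf'.

```haskell
data Ty = Bool | Ty :-> Ty deriving (Show, Eq)

data Tm = Var String | TTrue | TFalse | If Tm Tm Tm
        | Lam Ty String Tm | App Tm Tm
  deriving (Show, Eq)

class Sem el where
  true, false :: el
  xif :: el -> el -> el -> el
  lam :: Ty -> (el -> el) -> el
  app :: el -> el -> el

eval :: Sem el => [(String, el)] -> Tm -> el
eval rho (Var x) = maybe (error ("unbound variable " ++ x)) id (lookup x rho)
eval _ TTrue = true
eval _ TFalse = false
eval rho (If t u0 u1) = xif (eval rho t) (eval rho u0) (eval rho u1)
eval rho (Lam sigma x t) = lam sigma (\d -> eval ((x, d) : rho) t)
eval rho (App t u) = app (eval rho t) (eval rho u)

-- Set-theoretic semantics.
data El = STrue | SFalse | SLam Ty (El -> El)

instance Sem El where
  true = STrue
  false = SFalse
  xif STrue d _ = d
  xif SFalse _ d = d
  xif _ _ _ = error "xif: condition is not a boolean"
  lam = SLam
  app (SLam _ f) d = f d
  app _ _ = error "app: not a function"

-- Free semantics, with the redundancy-avoiding conditional if'.
if' :: Tm -> Tm -> Tm -> Tm
if' t TTrue TFalse = t
if' t u0 u1 = if u0 == u1 then u0 else If t u0 u1

instance Sem Tm where
  true = TTrue
  false = TFalse
  xif = if'
  lam sigma f = Lam sigma "x" (f (Var "x"))
  app = App

-- Decision trees.
data Tree a = Val a | Choice (Tree a) (Tree a)

instance Functor Tree where
  fmap h (Val a) = Val (h a)
  fmap h (Choice l r) = Choice (fmap h l) (fmap h r)
instance Applicative Tree where
  pure = Val
  tf <*> ta = tf >>= \h -> fmap h ta
instance Monad Tree where
  Val a >>= h = h a
  Choice l r >>= h = Choice (l >>= h) (r >>= h)

flatten :: Tree a -> [a]
flatten (Val a) = [a]
flatten (Choice l r) = flatten l ++ flatten r

enum :: Sem el => Ty -> Tree el
enum Bool = Choice (Val true) (Val false)
enum (sigma :-> tau) = fmap (lam sigma) (mkEnum (questions sigma) (enum tau))

questions :: Sem el => Ty -> [el -> el]
questions Bool = [\b -> b]
questions (sigma :-> tau) =
  [\f -> q (app f d) | d <- flatten (enum sigma), q <- questions tau]

mkEnum :: Sem el => [el -> el] -> Tree el -> Tree (el -> el)
mkEnum [] es = fmap (\e -> \_ -> e) es
mkEnum (q : qs) es =
  mkEnum qs es >>= \f1 ->
  mkEnum qs es >>= \f2 ->
  return (\d -> xif (q d) (f1 d) (f2 d))

find :: Sem el => [el] -> Tree el -> el
find [] (Val t) = t
find (a : as) (Choice l r) = xif a (find as l) (find as r)
find _ _ = error "find: answers do not fit the tree"

quote :: Ty -> El -> Tm
quote Bool STrue = TTrue
quote Bool SFalse = TFalse
quote (sigma :-> tau) (SLam _ f) =
  lam sigma (\t -> find [q t | q <- questions sigma]
                        (fmap (quote tau . f) (enum sigma)))
quote _ _ = error "quote: element does not match type"

nf :: Ty -> Tm -> Tm
nf sigma t = quote sigma (eval [] t)
```

Given the definitions of b2b, once, twice and thrice from the text, comparing nf (b2b :-> b2b) once with nf (b2b :-> b2b) thrice using == yields True, and the comparison with twice yields False.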

Since testing can only reveal the presence of errors we shall use the rest of 
this paper to prove that quote and hence nf behave correctly. 



4 Correctness of quote 

The main tool in our proof will be a notion of logical relations, a standard tool 
for the characterization of definable elements in models of typed lambda calculi 
since the pioneering work of Plotkin [13]. 

Let us agree to abbreviate Tm_[] σ by Tm σ and ⟦t⟧set [] by ⟦t⟧set.

Definition 4.1 (Logical Relations). We define a family of relations R^σ ⊆
Tm σ × ⟦σ⟧set by induction on σ ∈ Ty as follows:

— t R^Bool b iff t =βη True and b = true, or t =βη False and b = false;

— t R^{σ→τ} f iff, for all u, d, u R^σ d implies App t u R^τ f d.

Note that R is not indexed by contexts; logical relations only relate closed terms.
We extend logical relations to contexts: We write Tm Γ = ⟦Γ⟧syn for the type
of closed substitutions. Now for Γ ∈ Con we define R^Γ ⊆ Tm Γ × ⟦Γ⟧set by:

                            ρ R^Γ ρ'      t R^σ d
    -------------      -----------------------------------
     [] R^[] []         (x, t) : ρ  R^{(x,σ):Γ}  (x, d) : ρ'

Logical relations are invariant under βη-equality.

Lemma 4.2. If t R^σ d and t =βη t', then t' R^σ d.

Logical relations obey the following Fundamental Theorem, a kind of sound- 
ness theorem for logical relations. 

Lemma 4.3 (Fundamental Theorem of Logical Relations). If θ R^Γ ρ and
t ∈ Tm_Γ σ, then ⟦t⟧syn θ R^σ ⟦t⟧set ρ. In particular, if t ∈ Tm σ, then
t R^σ ⟦t⟧set.

The main result required to see that quote is correct is the following lemma:

Lemma 4.4 (Main Lemma). If t R^σ d, then t =βη quote^σ d.

The proof of this lemma is the subject of the next section. 

By correctness of quote we mean that it inverts set-theoretic evaluation of 
typed closed terms. 

Theorem 4.5 (Main Theorem). If t ∈ Tm σ, then t =βη quote^σ ⟦t⟧set.

Proof. Immediate from the Fundamental Theorem and the Main Lemma. □ 

The (constructive) existence and correctness of quote has a number of
straightforward important consequences.

Corollary 4.6 (Completeness). If t, t' ∈ Tm σ, then ⟦t⟧set = ⟦t'⟧set implies
t =βη t'.

Proof. Immediate from the Main Theorem. □

From soundness (Proposition 2.1) and completeness together we get that =βη is
decidable: checking whether t =βη t' reduces to checking whether ⟦t⟧set = ⟦t'⟧set,
which is decidable as ⟦−⟧set is computable and equality in finite sets is decidable.

Corollary 4.7. If t, t' ∈ Tm σ, then t =βη t' iff quote^σ ⟦t⟧set = quote^σ ⟦t'⟧set.

Proof. Immediate from soundness (Proposition 2.1) and the Main Theorem. □

This corollary shows that nf^σ = quote^σ ∘ ⟦−⟧set : Tm σ → Tm σ indeed makes
sense as a normalization function: apart from just delivering, for any given typed
closed term, some βη-equal term, it is actually guaranteed to deliver the same
term for t, t', if t, t' are βη-equal (morally, this is Church-Rosser for
reduction-free normalization).

Note that although we only stated completeness and normalization for typed
closed terms above, these trivially extend to all typed terms as open terms can
always be closed up by lambda-abstractions and this preserves βη-equality.

Corollary 4.8. If t, t' ∈ Tm σ and ⟦C⟧syn [(x, t)] =βη ⟦C⟧syn [(x, t')] for every
C ∈ Tm_[(x,σ)] Bool, then t =βη t'. Or, contrapositively, and more concretely, if
t, t' ∈ Tm (σ1 → ... → σn → Bool) and t ≠βη t', then there exist u1 ∈ Tm σ1,
..., un ∈ Tm σn such that

    App (... (App t u1) ...) un  ≠βη  App (... (App t' u1) ...) un

Proof. This corollary does not follow from the statement of the Main Theorem,
but it follows from its proof. □

Corollary 4.9 (Maximal consistency). If t, t' ∈ Tm σ and t ≠βη t', then
from the equation t = t' as an additional axiom one would derive True = False.

Proof. Immediate from the previous corollary. □

5 Proof of the Main Lemma 

We now present the proof of the main lemma which was postponed in the
previous section. To keep the proof readable, we write enum_set, questions_set,
find_set to emphasize the uses of the set-theoretic semantics instances of enum,
questions, find, while the free semantics instances will be written as enum_syn,
questions_syn, find_syn. We use the fact that any functor F such as Tree has an
effect on relations R ⊆ A × B denoted by F R ⊆ F A × F B, which can be defined as:

         p ∈ F {(a, b) ∈ A × B | a R b}
    ----------------------------------------
        fmap fst p   F R   fmap snd p

We first give the core of the proof and prove the lemmas this takes afterwards. 
Proof (of the Main Lemma). By induction on σ.

— Case Bool: Assume t R^Bool b. Then either t =βη True and b = true, in which
case we have

    t =βη True = quote^Bool true = quote^Bool b

or t =βη False and b = false, in which case we have

    t =βη False = quote^Bool false = quote^Bool b

— Case σ → τ: Assume t R^{σ→τ} f, i.e., for all u, d, u R^σ d implies
App t u R^τ f d. We have

    t =βη Lam^σ x (App t (Var x))
      =βη (by Lemma 5.2 below)
          Lam^σ x (App t (find_syn [q (Var x) | q ← questions^σ_syn] enum^σ_syn))
      =   (by Lemma 5.3 below)
          Lam^σ x (find_syn [q (Var x) | q ← questions^σ_syn]
                            (fmap (App t) enum^σ_syn))
      =βη (by Sublemma)
          Lam^σ x (find_syn [q (Var x) | q ← questions^σ_syn]
                            (fmap (quote^τ ∘ f) enum^σ_set))
      =   quote^{σ→τ} f

The Sublemma is:

    fmap (App t) enum^σ_syn  (Tree =βη)  fmap (quote^τ ∘ f) enum^σ_set

For proof, we notice that, by Lemma 5.1 (1) below for σ,

    enum^σ_syn  (Tree R^σ)  enum^σ_set

Hence, by assumption and the fact that fmap commutes with the effect on
relations

    fmap (App t) enum^σ_syn  (Tree R^τ)  fmap f enum^σ_set

Hence, by IH of the Lemma for τ,

    fmap (App t) enum^σ_syn  (Tree =βη)  fmap (quote^τ ∘ f) enum^σ_set

□

The proof above used two lemmas. One is essentially free, but the other is 
technical. 

Lemma 5.1 (“Free” Lemma).

1. enum^σ_syn (Tree R^σ) enum^σ_set.

2. questions^σ_syn [R^σ → R^Bool] questions^σ_set.

Proof. The proof is simultaneous for (1) and (2) by induction on σ.

— Case Bool: Trivial.

— Case σ → τ: Proof of (1) uses IH (2) for σ and IH (1) for τ; proof of (2)
uses IH (1) for σ and IH (2) for τ.

□ 



Lemma 5.2 (Technical Lemma). For t ∈ Tm_Γ σ:

    t =βη find_syn [q t | q ← questions^σ_syn] enum^σ_syn

Proof. By induction on σ.

— Case Bool:

    t =βη if' t True False
      =   if' t (find_syn [] (Val True)) (find_syn [] (Val False))
      =   find_syn [t] (Choice (Val True) (Val False))
      =   find_syn [q t | q ← questions^Bool_syn] enum^Bool_syn

— Case σ → τ:

    t =βη Lam^σ z (App t (Var z))    (z fresh wrt Γ)
      =βη (by IH for σ)
          Lam^σ z (App t (find_syn [q (Var z) | q ← questions^σ_syn] enum^σ_syn))
      =βη (by Sublemma below)
          Lam^σ z
            (find_syn [q (App t u) | u ← flatten enum^σ_syn, q ← questions^τ_syn]
                      (fmap (λg. g (Var z)) (mkenum_syn questions^σ_syn enum^τ_syn)))
      =   (by Lemma 5.3)
          find_syn [q (App t u) | u ← flatten enum^σ_syn, q ← questions^τ_syn]
                   (fmap (Lam^σ z) (fmap (λg. g (Var z))
                                         (mkenum_syn questions^σ_syn enum^τ_syn)))
      =   find_syn [q (App t u) | u ← flatten enum^σ_syn, q ← questions^τ_syn]
                   (fmap (λg. Lam^σ z (g (Var z))) (mkenum_syn questions^σ_syn enum^τ_syn))
      =α  find_syn [q (App t u) | u ← flatten enum^σ_syn, q ← questions^τ_syn]
                   (fmap (λg. Lam^σ x (g (Var x))) (mkenum_syn questions^σ_syn enum^τ_syn))
      =   find_syn [q t | q ← questions^{σ→τ}_syn] enum^{σ→τ}_syn

The α-conversion step is justified by the easily verifiable fact that the leaves
of mkenum_syn questions^σ_syn enum^τ_syn are closed “terms with a hole” (which
has the implication that mkenum_syn questions^σ_syn enum^τ_syn ∈
Tree (Tm_Δ σ → Tm_Δ τ) is natural in Δ).

The sublemma is: If u ∈ Tm_Δ σ, qs ∈ [Tm_Δ σ → Tm_Δ Bool], us ∈
Tree (Tm_Δ σ) and qs o us, then

    App t (find_syn [q u | q ← qs] us)
      =βη find_syn [q (App t u') | u' ← flatten us, q ← questions^τ_syn]
                   (fmap (λg. g u) (mkenum_syn qs enum^τ_syn))

The proof is by induction on qs o us.

• Case [] o Val u*:

    App t (find_syn [q u | q ← []] (Val u*))
      =   App t u*
      =βη (by IH of the Lemma for τ)
          find_syn [q (App t u*) | q ← questions^τ_syn] enum^τ_syn
      =   find_syn [q (App t u*) | q ← questions^τ_syn]
                   (fmap (λg. g u) (fmap (λv. λu'. v) enum^τ_syn))
      =   find_syn [q (App t u') | u' ← [u*], q ← questions^τ_syn]
                   (fmap (λg. g u) (mkenum_syn [] enum^τ_syn))

• Case q : qs o Choice l r:

    App t (find_syn [q' u | q' ← q : qs] (Choice l r))
      =   App t (if' (q u) (find_syn [q' u | q' ← qs] l)
                           (find_syn [q' u | q' ← qs] r))
      =βη if' (q u) (App t (find_syn [q' u | q' ← qs] l))
                    (App t (find_syn [q' u | q' ← qs] r))
      =βη (by IH of the Sublemma for qs o l, qs o r)
          if' (q u)
            (find_syn [q' (App t u') | u' ← flatten l, q' ← questions^τ_syn]
                      (fmap (λg. g u) (mkenum_syn qs enum^τ_syn)))
            (find_syn [q' (App t u') | u' ← flatten r, q' ← questions^τ_syn]
                      (fmap (λg. g u) (mkenum_syn qs enum^τ_syn)))
      =   find_syn [] (Val (if' (q u)
            (find_syn [q' (App t u') | u' ← flatten l, q' ← questions^τ_syn]
                      (fmap (λg. g u) (mkenum_syn qs enum^τ_syn)))
            (find_syn [q' (App t u') | u' ← flatten r, q' ← questions^τ_syn]
                      (fmap (λg. g u) (mkenum_syn qs enum^τ_syn)))))
      =βη (by twice Lemma 5.4)
          find_syn ([q' (App t u') | u' ← flatten l, q' ← questions^τ_syn]
                    ++ ([q' (App t u') | u' ← flatten r, q' ← questions^τ_syn]
                    ++ []))
                   ((fmap (λg. g u) (mkenum_syn qs enum^τ_syn)) >>= λv0.
                    (fmap (λg. g u) (mkenum_syn qs enum^τ_syn)) >>= λv1.
                    Val (if' (q u) v0 v1))
      =   find_syn ([q' (App t u') | u' ← flatten l, q' ← questions^τ_syn]
                    ++ [q' (App t u') | u' ← flatten r, q' ← questions^τ_syn])
                   (fmap (λg. g u)
                         ((mkenum_syn qs enum^τ_syn) >>= λg0.
                          (mkenum_syn qs enum^τ_syn) >>= λg1.
                          Val (λu'. if' (q u') (g0 u') (g1 u'))))
      =   find_syn [q' (App t u') | u' ← (flatten l) ++ (flatten r),
                                    q' ← questions^τ_syn]
                   (fmap (λg. g u) (mkenum_syn (q : qs) enum^τ_syn))
      =   find_syn [q' (App t u') | u' ← flatten (Choice l r),
                                    q' ← questions^τ_syn]
                   (fmap (λg. g u) (mkenum_syn (q : qs) enum^τ_syn))

We have used two lemmas, which are easy to prove:

Lemma 5.3. If as ∈ [A], us ∈ Tree B and as o us, then

    find_syn as (fmap f us) =βη f (find_syn as us)

Proof. Simple induction on as o us.

Lemma 5.4. If as, bs ∈ [Tm_Γ Bool], ts ∈ Tree (Tm_Γ σ), h ∈ Tm_Γ σ →
Tree (Tm_Γ τ), as o ts and, for all u ∈ Tm_Γ σ, bs o h u, then

    find_syn (as ++ bs) (ts >>= h) =βη find_syn bs (h (find_syn as ts))

Proof. By induction on as o ts.

— Case [] o Val t:

    find_syn ([] ++ bs) ((Val t) >>= h)
      = find_syn bs ((Val t) >>= h)
      = find_syn bs (h t)
      = find_syn bs (h (find_syn [] (Val t)))

— Case a : as o Choice l r:

    find_syn ((a : as) ++ bs) ((Choice l r) >>= h)
      =   find_syn (a : (as ++ bs)) ((Choice l r) >>= h)
      =   if' a (find_syn (as ++ bs) (l >>= h)) (find_syn (as ++ bs) (r >>= h))
      =   (by IH for as o l, as o r)
          if' a (find_syn bs (h (find_syn as l))) (find_syn bs (h (find_syn as r)))
      =βη find_syn bs (h (if' a (find_syn as l) (find_syn as r)))
      =   find_syn bs (h (find_syn (a : as) (Choice l r)))

□



6 Discussion and Further Work 

Instead of decision trees we could have used a direct encoding of the graph of a
function; we call this the truth-table semantics. However, this approach not only
leads to much longer normal forms but also makes semantic equality less efficient.
On the other hand it is possible to go further and use Binary Decision Diagrams
(BDDs) [7] instead of decision trees. We plan to explore this in further work and
also give a detailed analysis of the normal forms returned by our algorithm.

We have argued that λ→2 is the simplest λ-calculus with closed types; however,
we are confident that the technique described here works also for closed
types in λ0+1×→ (the finitary λ-calculus). We leave this extension for a journal
version of this work.

One can go even further and implement finitary Type Theory
(note that A + B = Σx∈2. If x A B). This could provide an interesting base
for a type-theoretic hardware description and verification language.

The approach presented here works only for calculi without type variables. It
remains open to see whether this approach can be merged with the standard
techniques for NBE for systems with type variables, leading to an alternative
proof of completeness and maybe even finite completeness for the calculi
discussed above.

Abstract. We propose pattern matching calculi as a refinement of
λ-calculus that integrates mechanisms appropriate for fine-grained
modelling of non-strict pattern matching.

Compared with the functional rewriting strategy usually employed 
to define the operational semantics of pattern matching in non-strict 
functional programming languages like Haskell or Clean, our pattern 
matching calculi achieve the same effects using simpler and more local 
rules. 

The main device is to embed into expressions the separate syntactic cate- 
gory of matchings; the resulting language naturally encompasses pattern 
guards and Boolean guards as special cases. 

By allowing a confluent reduction system and a normalising strategy, 
these pattern matching calculi provide a new basis for operational 
semantics of non-strict programming languages and also for implemen- 
tations. 



1 Introduction 

The operational semantics of functional programming languages is usually
explained via special kinds of λ-calculi and term rewriting systems (TRSs). One
way to look at the relation between these two approaches is to consider λ-calculi
as internalisations of term rewriting systems: λ-abstraction internalises
applicative TRSs (where each function is defined in a single “equation” the
left-hand side of which is an application of the function symbol to only
variables), and fixedpoint combinators internalise recursive function definitions.

In addition to these two features, modern functional programming languages
support function definitions based on pattern matching. A pattern is an
expression built only from variables and constructors; in the context of
applicative TRSs, constructors are function symbols that never occur as head of a
rule. In the functional programming context, constructors are introduced by
datatype definitions, for example the list constructors “[]” (empty list) and
“_ : _” (“cons”, non-empty list construction from head and tail).

In the term rewriting view, a function is defined by pattern matching if it
is defined by a group of rules, each having as left-hand side an application of
the defined function symbol to patterns. Definitions using pattern matching are
“processed sequentially”; for an example assume the following definition:

isEmptyList (x : xs) = False
isEmptyList ys = True

The second line is not an equation valid for arbitrary ys, but a rule that is only 
considered if the left-hand side of the first rule gives rise to a mismatch — here, 
the only value of ys for which this is the case is the empty list [] . 

Function definitions with patterns as arguments on the left-hand sides are 
typical of modern functional programming languages. In non-strict languages 
like Haskell, Clean, or Miranda, the operational semantics of pattern matching 
is quite complex; usually it is formulated as the functional rewriting strategy, 
which is a rather involved priority rewriting strategy [19, sect. 4.7.1]. 

In the case of, for example, Haskell, the operational semantics of pattern 
matching is defined via the special instance of pattern matching in case expres- 
sions. For isEmptyList, the above definition is considered as shorthand for: 

isEmptyList zs = case zs of (x : xs) -> False
                            ys -> True

Case expressions can be seen as an internalisation of pattern matching that is not
quite analogous to the internalisation of function abstraction in λ-calculus; the
important difference is that, in comparison with λ-abstractions, case expressions
contain not only the abstracted pattern matchings, but also an additional
application to an argument. To further complicate matters, Boolean guards and, more
recently, pattern guards interfere with the “straightforward” pattern matching.
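
As a reminder of what these two kinds of guards look like in Haskell source code, consider the following small sketch. The functions describe and lookupEither are illustrative examples and are not taken from the paper.

```haskell
-- Sequential processing: the second equation is only tried
-- when the argument does not match (_ : _).
isEmptyList :: [a] -> Bool
isEmptyList (_ : _) = False
isEmptyList _       = True

-- Boolean guard: if the pattern matches but the guard fails,
-- matching falls through to the next equation.
describe :: Maybe Int -> String
describe (Just n) | n > 0 = "positive"
describe (Just _)         = "non-positive"
describe Nothing          = "none"

-- Pattern guards: each guard performs a further match on a computed value.
lookupEither :: [(String, Int)] -> String -> String -> Maybe Int
lookupEither env k1 k2
  | Just v <- lookup k1 env = Just v
  | Just v <- lookup k2 env = Just v
  | otherwise               = Nothing
```

For instance, describe (Just 0) matches the first pattern, fails its guard, and is then handled by the second equation.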

In this paper we present a new calculus that cleanly internalises pattern 
matching by drawing a clearer distinction between the aspects involved. For 
that purpose, we essentially liberate the case expression from its rigidly built-in 
application, generalising the special syntactic category of case alternatives into 
the new syntactic category of matchings that incorporates all aspects of pattern 
matching, as opposed to the (preserved) syntactical category of expressions that 
now is mostly concerned with pattern construction and function application. 

This allows straightforward internalisation of pattern matching definitions 
without having to introduce new variables like zs for the case variant: 

isEmptyList = {| (x : xs) ⟾ ↿False↾ | ys ⟾ ↿True↾ |}

In addition, using the pattern matching calculus as basis of functional program- 
ming has advantages both for expressivity and reasoning about programs. 

With respect to reasoning, the full internalisation of pattern matching elimi- 
nates the problem of all priority systems that what is written down as an uncon- 
ditional equation only applies to certain patterns “left over” from higher-priority 
equations defining the same function. The usual justification for allowing this 
non-orthogonality is that otherwise the number of equations would explode. 
Our matching language allows direct transliteration of such prioritised defini- 
tions without additional cost, and even includes the means to factor out more 
commonalities than is possible in priority rewriting systems. The syntactical
features necessary to achieve this turn out to be sufficient to include both Boolean
guards and pattern guards as special cases. This gives the language a boost in
expressivity over systems directly based on term rewriting, and at the same time 
keeps the system simple and uniform. 

A noteworthy result is that by providing two variants of a simple rule con- 
cerned with results of matching failure, we obtain two interesting systems, both 
confluent and equipped with the same normalising strategy: 

— The first mirrors exactly the definition of pattern matching in, e.g., Haskell, 
which corresponds to the functional rewrite strategy modified by treating 
matching against non-covered alternatives as a run-time error. It is well 
known that the functional strategy, considered as a term rewriting strategy, 
is not normalising, so there are certain terms that, translated into our first 
system, have a normal form that corresponds to such run-time errors. 

— The second system is a refinement of the first in that it preserves all
terminating reductions not ending in a run-time error, and also has such “successful”
reductions for some terms that reduce to run-time errors in the first system.

Similar mechanisms have been proposed in the literature, see Sect. 8; we feel 
that the setting of the pattern matching calculus helps to clarify the issues 
involved and provides an attractive environment for describing and analysing 
such alternative treatments of matching failure. 

After presenting the abstract syntax, we show how the pattern matching
calculus encodes λ-calculus and Haskell pattern matching including Boolean and
pattern guards. Sect. 4 presents the reduction rules, which are applied to selected 
examples in Sect. 5. In Sect. 6 we summarise the mechanised confluence proof, 
and Sect. 7 is devoted to the normalising reduction strategy. Sect. 8 discusses 
related work. Details omitted here for reasons of space can be found in [10]. 

2 Abstract Syntax 

The pattern matching calculus, from now on usually abbreviated PMC, has two 
major syntactic categories, namely expressions and matchings. These are defined 
by mutual recursion. When considering the analogy to functional programs, only 
expressions of the pattern matching calculus correspond to expressions of func- 
tional programming languages. Matchings can be seen as a generalisation of 
groups of case alternatives. Operationally, matchings can be “waiting for argument
supply”, or they can be saturated; saturated matchings can succeed and
then return an expression, or they can fail. Patterns form a separate syntactic
category that will be used to construct pattern matchings. 

We now present the abstract syntax of the pattern matching calculus with 
some intuitive explanation of the intended meaning of the constructs. 

As base sets, we use Var as the set of variables, and Constr as the set of con- 
structors. For the purpose of our examples, numbers are assumed to be elements 
of Constr and are used only in zero-ary constructions (which are written without 
parentheses). Constructors will, as usual, be used to build both patterns and 
expressions. Indeed, one might consider Pat as a subset of Expr. 
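The following summarises the abstract syntax of PMC:

$$
\begin{array}{rcll}
\mathit{Pat} & ::= & \mathit{Var} & \text{variable} \\
 & \mid & \mathit{Constr}(\mathit{Pat}, \ldots, \mathit{Pat}) & \text{constructor pattern} \\[1ex]
\mathit{Expr} & ::= & \mathit{Var} & \text{variable} \\
 & \mid & \mathit{Constr}(\mathit{Expr}, \ldots, \mathit{Expr}) & \text{constructor application} \\
 & \mid & \mathit{Expr}\ \mathit{Expr} & \text{function application} \\
 & \mid & \{|\, \mathit{Match} \,|\} & \text{matching abstraction} \\
 & \mid & \oslash & \text{empty expression} \\[1ex]
\mathit{Match} & ::= & \upharpoonleft \mathit{Expr} \upharpoonright & \text{expression matching} \\
 & \mid & \lightning & \text{failure} \\
 & \mid & \mathit{Pat} \Mapsto \mathit{Match} & \text{pattern matching} \\
 & \mid & \mathit{Expr} \triangleright \mathit{Match} & \text{argument supply} \\
 & \mid & \mathit{Match} \mid \mathit{Match} & \text{alternative}
\end{array}
$$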

Patterns are built from variables and constructor applications. All variables 
occurring in a pattern are free in that pattern; for every pattern p : Pat, we 
denote its set of free variables by FV(p). In the following, we silently restrict 
all patterns to be linear, i.e., not to contain more than one occurrence of any
variable. 

Expressions are the syntactic category that embodies the term construction 
aspects; besides variables, constructor application and function application, we 
also have the following special kinds of expressions: 

— Every matching m gives rise to the matching abstraction (matching expression)
$\{|\, m \,|\}$, which might be read “match m”.

If the matching m is unsaturated, i.e., “waiting for arguments”, then $\{|\, m \,|\}$
abstracts m into a function.

If m is a saturated matching, then it can either succeed or fail; if it succeeds,
then $\{|\, m \,|\}$ reduces to the value “returned” by m; otherwise, $\{|\, m \,|\}$ is
considered ill-defined.

— We call $\oslash$ the empty expression; it results from matching failures. According
to the above, it could also be called the “ill-defined expression”.

We use the somewhat uncommitted name “empty expression” since we will
consider two interpretations of $\oslash$:

• It can be a “manifestly undefined” expression equivalent to non-termination,
following the common view that divergence is semantically
equivalent to run-time errors.

• It can be a special “error” value propagating matching failure $\lightning$, considered
as an “exception”, through the syntactic category of expressions.

None of the expression constructors binds any variables; we overload the FV(_) 
notation and denote for an expression e : Expr its set of free variables by FV(e). 

For the purposes of pattern matching, constructor applications of the same 
constructor, but with different arities, are considered incompatible. 
Matchings are the syntactic category embodying the pattern analysis aspects:

— For an expression e : Expr, the expression matching $\upharpoonleft e \upharpoonright$ always succeeds and
returns e, so we propose to read it “return e”.

— $\lightning$ is the matching that always fails.

— The pattern matching $p \Mapsto m$ waits for supply of one argument more than
m; this pattern matching can be understood as succeeding on instances of
the (linear) pattern p : Pat and then continuing to behave as the resulting
instance of the matching m : Match. It roughly corresponds to a single case
alternative in languages with case expressions.

— Argument supply $a \triangleright m$ is the matching-level incarnation of function application,
with the argument on the left, and the matching it is supplied to on
the right. It saturates the first argument m is waiting for.

The inclusion of argument supply into the calculus is an important source 
of flexibility in the design of the reduction system. 

— The alternative $m_1 \mid m_2$ will in this paper be understood sequentially: it behaves
like $m_1$ until this fails, and only then it behaves like $m_2$.

Pattern matching $p \Mapsto m$ binds all variables occurring in p, so
$\mathrm{FV}(p \Mapsto m) = \mathrm{FV}(m) - \mathrm{FV}(p)$, letting FV(m) denote the set of free variables of a matching m.
Pattern matching is the only variable binder in this calculus — taking this into 
account, the definitions of free variables, bound variables, and substitution are 
as usual. Note that there are no matching variables; variables can only occur as 
patterns or as expressions. 

We will omit the parentheses in matchings of the shape $a \triangleright (p \Mapsto m)$ since
there is only one way to parse $a \triangleright p \Mapsto m$ in PMC.
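
The abstract syntax of this section can be transcribed into Haskell data types. The following is only a sketch: the constructor names (PVar, PCon, Abs, Ret, Fail, Empty) and the ASCII operators `:=>` (pattern matching), `:>` (argument supply) and `:|` (alternative) are illustrative choices made here, not notation from the paper, and free-variable sets are represented as duplicate-free lists.

```haskell
import Data.List (nub, (\\))

type Var    = String
type Constr = String

-- Patterns: variables and constructor patterns (assumed linear).
data Pat
  = PVar Var
  | PCon Constr [Pat]
  deriving (Eq, Show)

-- Expressions: the term-construction side of the calculus.
data Expr
  = Var Var             -- variable
  | Con Constr [Expr]   -- constructor application
  | App Expr Expr       -- function application
  | Abs Match           -- matching abstraction
  | Empty               -- empty expression
  deriving (Eq, Show)

-- Matchings: the pattern-analysis side of the calculus.
data Match
  = Ret Expr            -- expression matching ("return e")
  | Fail                -- failure
  | Pat :=> Match       -- pattern matching (the only binder)
  | Expr :> Match       -- argument supply
  | Match :| Match      -- alternative
  deriving (Eq, Show)

-- Equal precedence and right associativity, so that  a :> p :=> m
-- parses as  a :> (p :=> m),  as agreed at the end of this section.
infixr 5 :=>, :>
infixr 4 :|

-- Free variables of patterns, expressions and matchings.
fvP :: Pat -> [Var]
fvP (PVar v)    = [v]
fvP (PCon _ ps) = concatMap fvP ps

fvE :: Expr -> [Var]
fvE (Var v)    = [v]
fvE (Con _ es) = nub (concatMap fvE es)
fvE (App f a)  = nub (fvE f ++ fvE a)
fvE (Abs m)    = fvM m
fvE Empty      = []

fvM :: Match -> [Var]
fvM (Ret e)   = fvE e
fvM Fail      = []
fvM (p :=> m) = fvM m \\ fvP p      -- pattern variables are bound in m
fvM (a :> m)  = nub (fvE a ++ fvM m)
fvM (m :| n)  = nub (fvM m ++ fvM n)
```

Keeping Expr and Match as two separate, mutually recursive types mirrors the two syntactic categories; in particular, a matching cannot appear where an expression is expected without being wrapped in a matching abstraction.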

3 Examples 

Even though we have not yet introduced PMC reduction, the explanations of 
the syntax of PMC in the previous section should allow the reader to understand 
the examples presented in this section. We first show the natural embedding 
of the untyped λ-calculus into PMC and then continue to give translations for
Haskell function definitions first using pattern matching only, then together with 
Boolean guards and finally together with pattern guards. 

It is easy to see that the pattern matching calculus includes the λ-calculus.
Variables and function application are translated directly, and λ-abstraction is a
matching abstraction over a pattern matching that has a single-variable pattern
and a result matching that immediately returns the body:

$$\lambda v.\, e \;:=\; \{|\, v \Mapsto \upharpoonleft e \upharpoonright \,|\}$$

In Sect. 5 we shall see that this embedding also preserves reducibility. 

As an example for the translation of Haskell programs into PMC, we show one 
that also serves as an example for non-normalisation of the functional rewriting 
strategy; with this program and the additional definition bot = bot, the functional
strategy loops (detected by some implementations) on evaluation of the
expression f bot (3:[]), although “obviously” it “could” reduce to 2:

f (x:xs) [] = 1
f ys (v:vs) = 2

For translation into PMC, we have to decide how we treat bot. We could translate
it directly into an application of a fixedpoint combinator to the identity function;
if we call the resulting expression $\bot$, then $\bot$ gives rise to cyclic reductions. In
this case, we obtain for f bot (3:[]) the following expression:

$$\{|\, ((x : xs) \Mapsto [\,] \Mapsto \upharpoonleft 1 \upharpoonright) \mid (ys \Mapsto (v : vs) \Mapsto \upharpoonleft 2 \upharpoonright) \,|\} \;\bot\; (3 : [\,])$$

A different possibility is to recognise that the goal of the above “definition” of bot
is to produce an undefined expression; if the empty expression $\oslash$ is understood
as undefined, then we could use that.

We will investigate reduction of both possibilities below, in Sect. 5. 

In several functional programming languages, Boolean guards may be added
after the pattern part of a definition equation; the failure of such a guard has the 
same effect as pattern matching failure: if more definition equations are present, 
the next one is tried. For example: 

g (x:xs) | x > 5 = 2
g ys             = 3

Translation into a case-expression turns such a guard into a match to the Boolean 
constructor True and a default branch that redirects mismatches to the next line 
of the definition. In PMC, we do not need to make the mismatch case explicit, 
but can directly translate from the Haskell formulation. The above function g 
therefore corresponds to the following PMC expression: 

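$$\{|\, ((x : xs) \Mapsto (x > 5) \triangleright \mathsf{True} \Mapsto \upharpoonleft 2 \upharpoonright) \mid (ys \Mapsto \upharpoonleft 3 \upharpoonright) \,|\}$$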
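
For comparison, the case-expression translation mentioned above can be written out in Haskell itself. This is a sketch: the name gCase and the local binding next are introduced here for illustration only. It shows the explicit redirection of mismatches that the PMC expression does not need.

```haskell
-- The guarded definition from the text.
g :: [Int] -> Int
g (x:xs) | x > 5 = 2
g ys             = 3

-- Translation into case expressions: the guard becomes a match against
-- True, and every mismatch is redirected explicitly to the next equation.
gCase :: [Int] -> Int
gCase arg =
  let next = 3  -- body of the second equation, g ys = 3
  in case arg of
       (x:_) -> case x > 5 of
                  True  -> 2
                  False -> next
       _     -> next
```

Both functions agree on all lists; the second makes visible that a failing guard and a failing pattern lead to the same fall-through.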

Pattern guards [6] are a generalisation of Boolean guards; these incorporate
not only the decision aspect of Boolean guards, but also the variable binding aspect
of pattern matching. In PMC, both can be represented as saturated patterns,
i.e., as pattern matchings that already have an argument supplied to them. For
a pattern guard example, we use Peyton Jones’ clunky:

clunky env v1 v2 | Just r1 <- lookup env v1
                 , Just r2 <- lookup env v2 = r1 + r2
                 | otherwise                = v1 + v2

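We attempt analogous layout for the PMC expression corresponding to the function
clunky (with appropriate conventions, we could omit more parentheses):

$$
\begin{array}{l}
\{|\, env \Mapsto v_1 \Mapsto v_2 \Mapsto ((lookup\ env\ v_1 \triangleright \mathsf{Just}(r_1) \Mapsto \\
\qquad\qquad\qquad\qquad\ \ lookup\ env\ v_2 \triangleright \mathsf{Just}(r_2) \Mapsto \upharpoonleft r_1 + r_2 \upharpoonright) \\
\qquad\qquad\qquad\qquad \mid \upharpoonleft v_1 + v_2 \upharpoonright) \,|\}
\end{array}
$$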
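
The clunky example can be run as ordinary Haskell once an environment type is fixed. The sketch below makes two assumptions that are not in the paper: the environment is an association list of Ints, and a helper lookupIn takes the environment first (the Prelude's lookup takes the key first). The second function, clunkyCase, is a hypothetical variant without pattern guards, included to show the duplicated fall-through that the guards avoid.

```haskell
-- Hypothetical environment type and lookup with the argument order used
-- in the text (environment first).
type Env = [(Int, Int)]

lookupIn :: Env -> Int -> Maybe Int
lookupIn env k = lookup k env

-- Pattern guards: each guard both tests and binds.
clunky :: Env -> Int -> Int -> Int
clunky env v1 v2
  | Just r1 <- lookupIn env v1
  , Just r2 <- lookupIn env v2 = r1 + r2
  | otherwise                  = v1 + v2

-- The same function with nested case expressions; the fall-through to the
-- last equation has to be mentioned once per possible mismatch.
clunkyCase :: Env -> Int -> Int -> Int
clunkyCase env v1 v2 =
  case lookupIn env v1 of
    Just r1 -> case lookupIn env v2 of
      Just r2 -> r1 + r2
      Nothing -> fallback
    Nothing -> fallback
  where
    fallback = v1 + v2
```

In the PMC expression above, the two lookups appear as argument supplies to constructor patterns, so a failing lookup is an ordinary matching failure and control passes to the second alternative without an explicit fallback branch.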
Irrefutable patterns, in Haskell indicated by the prefix ~, match lazily, i.e.,
matching is delayed until one of the component variables is needed. There are
no special provisions for irrefutable patterns in PMC; they have to be translated
in essentially the same way as in Haskell. For example, with body possibly
containing occurrences of x and xs, the definition:

q ~(x:xs) = body

expands into the following shape according to the Haskell report: 

q = \v -> (\x -> \xs -> body) (case v of (x:xs) -> x)
                              (case v of (x:xs) -> xs)

In PMC, we can turn the two function applications into saturated patterns: 

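$$q = \{|\, v \Mapsto \{|\, v \triangleright (x : xs) \Mapsto \upharpoonleft x \upharpoonright \,|\} \triangleright x \Mapsto \{|\, v \triangleright (x : xs) \Mapsto \upharpoonleft xs \upharpoonright \,|\} \triangleright xs \Mapsto \upharpoonleft body \upharpoonright \,|\}$$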
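
The laziness of irrefutable patterns can be observed directly in Haskell. In this sketch the function names and the concrete bodies (42, and x + sum xs) are illustrative choices made here; qExpanded follows the expansion shape quoted above.

```haskell
-- Lazy match: nothing is inspected until x or xs is demanded, so a body
-- that ignores both succeeds even on the empty list.
qLazy :: [Int] -> Int
qLazy ~(x:xs) = 42

-- The expansion in the shape given above, with body = 42.
qExpanded :: [Int] -> Int
qExpanded = \v -> (\x -> \xs -> 42) (case v of (x:xs) -> x)
                                     (case v of (x:xs) -> xs)

-- A body that uses the components forces the delayed match on demand.
qSum :: [Int] -> Int
qSum ~(x:xs) = x + sum xs
```

Applying qLazy or qExpanded to the empty list returns 42 because the two case expressions are never evaluated; qSum on the empty list would raise a pattern-match error only when x is demanded.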

4 Standard Reduction Rules 

The intuitive explanations in Sect. 2 only provide guidance to one particular way 
of providing a semantics to PMC expressions and matchings. In this section, we 
provide a set of rules that implement the usual pattern matching semantics of 
non-strict languages by allowing corresponding reduction of PMC expressions as 
they arise from translating functional programs. In particular, we do not include 
extensionality rules. 

Formally, we define two redex reduction relations: $\longrightarrow\ : \mathit{Expr} \leftrightarrow \mathit{Expr}$ for
expressions, and $\longrightarrow_M\ : \mathit{Match} \leftrightarrow \mathit{Match}$ for matchings. These are the smallest
relations including the rules listed in Sections 4.1 to 4.3. In Section 4.4 we briefly
discuss the characteristics of the resulting rewriting system.

We use the following conventions for metavariables: v is a variable; $a, a_1, a_2, \ldots, b, e, e_1, e_2, \ldots, f$
are expressions; k, n are natural numbers; c, d are
constructors; $m, m_1, m_2, \ldots$ are matchings; $p, p_1, p_2, \ldots, q$ are patterns.

4.1 Failure and Returning 

Failure $\lightning$ is the (left) unit for $\mid$; this enables discarding of failed alternatives and
transfer of control to the next alternative:

$$\lightning \mid m \;\longrightarrow_M\; m \qquad (\lightning\mid)$$

A matching abstraction where all alternatives fail represents an ill-defined case;
this motivates the introduction of the empty expression into our language:

$$\{|\, \lightning \,|\} \;\longrightarrow\; \oslash \qquad (\{|\lightning|\})$$

Empty expressions are produced only by this rule; the rules $(\oslash @)$ and $(\oslash \triangleright c)$
below only propagate them.
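Expression matchings are left-zeros for $\mid$:

$$\upharpoonleft e \upharpoonright \mid m \;\longrightarrow_M\; \upharpoonleft e \upharpoonright \qquad (\upharpoonleft\upharpoonright\mid)$$

Matching abstractions built from expression matchings are equivalent to the
contained expression:

$$\{|\, \upharpoonleft e \upharpoonright \,|\} \;\longrightarrow\; e \qquad (\{|\upharpoonleft\upharpoonright|\})$$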

4.2 Application and Argument Supply 

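Application of a matching abstraction reduces to argument supply inside the
abstraction:

$$\{|\, m \,|\}\; a \;\longrightarrow\; \{|\, a \triangleright m \,|\} \qquad (\{|\,|\}@)$$

Argument supply to an expression matching reduces to function application
inside the expression matching:

$$a \triangleright \upharpoonleft e \upharpoonright \;\longrightarrow_M\; \upharpoonleft e\; a \upharpoonright \qquad (\triangleright\upharpoonleft\upharpoonright)$$

No matter which of our two interpretations of the empty expression we choose,
it absorbs arguments when used as function in an application:

$$\oslash\; e \;\longrightarrow\; \oslash \qquad (\oslash @)$$

Analogously, failure absorbs argument supply:

$$e \triangleright \lightning \;\longrightarrow_M\; \lightning \qquad (\triangleright\lightning)$$

Argument supply distributes into alternatives:

$$e \triangleright (m_1 \mid m_2) \;\longrightarrow_M\; (e \triangleright m_1) \mid (e \triangleright m_2) \qquad (\triangleright\mid)$$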



4.3 Pattern Matching 

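Everything matches a variable pattern; this matching gives rise to substitution:

$$a \triangleright v \Mapsto m \;\longrightarrow_M\; m[v \backslash a] \qquad (\triangleright v)$$

Matching constructors match, and the proviso can always be ensured via α-conversion
(for this rule to make sense, linearity of patterns is important):

$$c(e_1, \ldots, e_n) \triangleright c(p_1, \ldots, p_n) \Mapsto m \;\longrightarrow_M\; e_1 \triangleright p_1 \Mapsto \cdots\, e_n \triangleright p_n \Mapsto m \qquad (c \triangleright c)$$

$$\text{if } \mathrm{FV}(c(e_1, \ldots, e_n)) \cap \mathrm{FV}(c(p_1, \ldots, p_n)) = \{\}$$

Matching of different constructors fails:

$$d(e_1, \ldots, e_k) \triangleright c(p_1, \ldots, p_n) \Mapsto m \;\longrightarrow_M\; \lightning \qquad \text{if } c \neq d \text{ or } k \neq n \qquad (d \triangleright c)$$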

For the case where an empty expression is matched against a constructor pattern, 
we consider two different right-hand sides: 






Wolfram Kahl 



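— The calculus PMC$_{\oslash}$ interprets the empty expression as equivalent to non-termination,
so constructor pattern matchings are strict in the supplied argument:

$$\oslash \triangleright c(p_1, \ldots, p_n) \Mapsto m \;\longrightarrow_M\; \upharpoonleft \oslash \upharpoonright \qquad (\oslash \triangleright c \to \oslash)$$

— The calculus PMC$_{\lightning}$ interprets the empty expression as propagating the exception
of matching failure, and “resurrects” that failure when matching
against a constructor:

$$\oslash \triangleright c(p_1, \ldots, p_n) \Mapsto m \;\longrightarrow_M\; \lightning \qquad (\oslash \triangleright c \to \lightning)$$

For statements that hold in both PMC$_{\oslash}$ and PMC$_{\lightning}$, we let the rule name $(\oslash \triangleright c)$
stand for the rule $(\oslash \triangleright c \to \oslash)$ in PMC$_{\oslash}$ and for $(\oslash \triangleright c \to \lightning)$ in PMC$_{\lightning}$.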



4.4 Rewriting System Aspects 

For a reduction rule (R), the one-step redex reduction relation defined by that
rule is written $\longrightarrow_{(R)}$; this will either have only expressions, or only matchings
in its domain and range. Furthermore, we let $\xrightarrow[(R)]{\circ}$ be the one-step reduction
relation closed under expression and matching construction.

Each of the rewriting systems PMC$_{\oslash}$ and PMC$_{\lightning}$ formed by the reduction
rules introduced in Sections 4.1 to 4.3 consists of nine first-order term rewriting
rules, two rule schemata $(\oslash \triangleright c)$ and $(d \triangleright c)$, parameterised by the constructors
and the arities, that involve the binding constructor $\Mapsto$ but not any bound
variables, the second-order rule $(\triangleright v)$ involving substitution, and the second-order
rule schema $(c \triangleright c)$ for pattern matching that re-binds variables.

The substituting rule $(\triangleright v)$ has almost the same syntactical characteristics as
β-reduction, and can be directly reformulated as a CRS rule. (CRS stands for
combinatory reduction system [11,17].)

The pattern matching rule schema $(c \triangleright c)$ involves binders binding multiple
variables, but its individual rules still could be reformulated as CRS rules.

The whole system is neither orthogonal nor does it have any other properties
like weak orthogonality for which the literature provides confluence proofs; we
describe a confluence proof in Sect. 6.

5 Reduction Examples 

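For the translation of λ-calculus into PMC it is easy to see that every β-reduction
can be emulated by a three-step reduction sequence in PMC:

$$
\begin{array}{rcl}
(\lambda v.\, e)\, a \;=\; \{|\, v \Mapsto \upharpoonleft e \upharpoonright \,|\}\; a
 & \xrightarrow[(\{|\,|\}@)]{\circ} & \{|\, a \triangleright v \Mapsto \upharpoonleft e \upharpoonright \,|\} \\
 & \xrightarrow[(\triangleright v)]{\circ} & \{|\, \upharpoonleft e \upharpoonright [v \backslash a] \,|\} \;=\; \{|\, \upharpoonleft e[v \backslash a] \upharpoonright \,|\} \\
 & \xrightarrow[(\{|\upharpoonleft\upharpoonright|\})]{\circ} & e[v \backslash a]
\end{array}
$$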
By induction over λ-terms and PMC-reductions starting from translations of λ-terms
one can show that such reductions can never lead to PMC expressions containing
constructors, failure, $\oslash$, or alternatives, and can only use the four rules
$(\{|\,|\}@)$, $(\triangleright v)$, $(\{|\upharpoonleft\upharpoonright|\})$, and $(\triangleright\upharpoonleft\upharpoonright)$. Of these, the first three make up the translation
of β-reduction, and the last can only be applied to “undo” the effect of a “premature”
application of $(\{|\,|\}@)$. In addition, for each PMC reduction sequence
starting from the translation of a λ-term t, we can construct a corresponding β-reduction
sequence starting from t showing that the only real difference between
arbitrary PMC reduction sequences of translations of λ-terms and β-reduction
sequences is that the PMC reduction sequence may contain steps corresponding
to “unfinished” β-steps, and “premature” $(\{|\,|\}@)$ steps.

Therefore, no significant divergence is possible, and confluence of the standard
PMC reduction rules, to be shown below, implies that this embedding of
the untyped λ-calculus is faithful.
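
A “premature” application step and its repair can be spelled out as a short derivation. This is a sketch using only the rules of Sect. 4, for arbitrary expressions e and a, writing $\{|\, m \,|\}$ for matching abstraction, $\upharpoonleft e \upharpoonright$ for the expression matching “return e” and $\triangleright$ for argument supply:

$$
\begin{array}{rcll}
\{|\, \upharpoonleft e \upharpoonright \,|\}\; a
 & \longrightarrow & \{|\, a \triangleright \upharpoonleft e \upharpoonright \,|\} & \text{(application becomes argument supply)} \\
 & \longrightarrow & \{|\, \upharpoonleft e\; a \upharpoonright \,|\} & \text{(argument supply to an expression matching)} \\
 & \longrightarrow & e\; a & \text{(abstraction of an expression matching)}
\end{array}
$$

The same result $e\; a$ is reached in a single step when the abstraction $\{|\, \upharpoonleft e \upharpoonright \,|\}$ is reduced to e first, which is why the second step here only undoes the effect of applying the first one too early.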

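For the PMC translation of the Haskell expression f bot (3:[]), the normalising
strategy we present below produces the following reduction sequence:

$$
\begin{array}{cl}
 & \{|\, ((x : xs) \Mapsto [\,] \Mapsto \upharpoonleft 1 \upharpoonright) \mid (ys \Mapsto (v : vs) \Mapsto \upharpoonleft 2 \upharpoonright) \,|\} \;\bot\; (3 : [\,]) \\
\xrightarrow[(\{|\,|\}@)]{\circ} & \{|\, \bot \triangleright (((x : xs) \Mapsto [\,] \Mapsto \upharpoonleft 1 \upharpoonright) \mid (ys \Mapsto (v : vs) \Mapsto \upharpoonleft 2 \upharpoonright)) \,|\}\; (3 : [\,]) \\
\xrightarrow[(\{|\,|\}@)]{\circ} & \{|\, (3 : [\,]) \triangleright \bot \triangleright (((x : xs) \Mapsto [\,] \Mapsto \upharpoonleft 1 \upharpoonright) \mid (ys \Mapsto (v : vs) \Mapsto \upharpoonleft 2 \upharpoonright)) \,|\} \\
\xrightarrow[(\triangleright\mid)]{\circ} & \{|\, (3 : [\,]) \triangleright ((\bot \triangleright (x : xs) \Mapsto [\,] \Mapsto \upharpoonleft 1 \upharpoonright) \mid (\bot \triangleright ys \Mapsto (v : vs) \Mapsto \upharpoonleft 2 \upharpoonright)) \,|\}
\end{array}
$$

From here, reduction would loop on the vain attempt to evaluate the first occurrence
of $\bot$. If we replace $\bot$ with the empty expression $\oslash$, then we obtain
different behaviour according to which interpretation we choose for $\oslash$: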

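In PMC$_{\oslash}$, the empty expression propagates:

$$
\begin{array}{cl}
 & \{|\, (3 : [\,]) \triangleright ((\oslash \triangleright (x : xs) \Mapsto [\,] \Mapsto \upharpoonleft 1 \upharpoonright) \mid (\oslash \triangleright ys \Mapsto (v : vs) \Mapsto \upharpoonleft 2 \upharpoonright)) \,|\} \\
\xrightarrow[(\oslash \triangleright c \to \oslash)]{\circ} & \{|\, (3 : [\,]) \triangleright (\upharpoonleft \oslash \upharpoonright \mid (\oslash \triangleright ys \Mapsto (v : vs) \Mapsto \upharpoonleft 2 \upharpoonright)) \,|\} \\
\xrightarrow[(\upharpoonleft\upharpoonright\mid)]{\circ} & \{|\, (3 : [\,]) \triangleright \upharpoonleft \oslash \upharpoonright \,|\} \\
\xrightarrow[(\triangleright\upharpoonleft\upharpoonright)]{\circ} & \{|\, \upharpoonleft \oslash\; (3 : [\,]) \upharpoonright \,|\} \\
\xrightarrow[(\{|\upharpoonleft\upharpoonright|\})]{\circ} & \oslash\; (3 : [\,]) \\
\xrightarrow[(\oslash @)]{\circ} & \oslash
\end{array}
$$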



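In PMC$_{\oslash}$, the empty expression $\oslash$ is like a runtime error: it terminates reduction
in an “abnormal” way, by propagating through all constructs like an uncaught
exception. In PMC$_{\lightning}$, however, this exception can be caught: matching the empty
expression against list construction produces a failure, and the other alternative
succeeds:

$$
\begin{array}{cl}
 & \{|\, (3 : [\,]) \triangleright ((\oslash \triangleright (x : xs) \Mapsto [\,] \Mapsto \upharpoonleft 1 \upharpoonright) \mid (\oslash \triangleright ys \Mapsto (v : vs) \Mapsto \upharpoonleft 2 \upharpoonright)) \,|\} \\
\xrightarrow[(\oslash \triangleright c \to \lightning)]{\circ} & \{|\, (3 : [\,]) \triangleright (\lightning \mid (\oslash \triangleright ys \Mapsto (v : vs) \Mapsto \upharpoonleft 2 \upharpoonright)) \,|\} \\
\xrightarrow[(\lightning\mid)]{\circ} & \{|\, (3 : [\,]) \triangleright \oslash \triangleright ys \Mapsto (v : vs) \Mapsto \upharpoonleft 2 \upharpoonright \,|\} \\
\xrightarrow[(\triangleright v)]{\circ} & \{|\, (3 : [\,]) \triangleright (v : vs) \Mapsto \upharpoonleft 2 \upharpoonright \,|\} \\
\xrightarrow[(c \triangleright c)]{\circ} & \{|\, 3 \triangleright v \Mapsto [\,] \triangleright vs \Mapsto \upharpoonleft 2 \upharpoonright \,|\} \\
\xrightarrow[(\triangleright v)]{\circ} & \{|\, [\,] \triangleright vs \Mapsto \upharpoonleft 2 \upharpoonright \,|\} \\
\xrightarrow[(\triangleright v)]{\circ} & \{|\, \upharpoonleft 2 \upharpoonright \,|\} \\
\xrightarrow[(\{|\upharpoonleft\upharpoonright|\})]{\circ} & 2
\end{array}
$$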
It is not hard to see that reduction in PMC cannot arrive at the result 2 in the
example with $\bot$, even if the second alternative can, for example after the third
step of the original sequence, be reduced to $\upharpoonleft 2 \upharpoonright$: If a pattern matching $p \Mapsto m$
has been supplied with an argument a, then in the resulting $a \triangleright p \Mapsto m$, the
matching m can be considered as guarded by the pattern guard “$a \triangleright p$” (in abuse
of syntax). An alternative can only be committed to if all its pattern guards
succeed; discarding (ultimately via $(\lightning\mid)$) an alternative with only non-variable
patterns and arguments for all patterns only works if its first pattern
guard can be determined as mismatching. In PMC$_{\oslash}$, this can only be via $(d \triangleright c)$;
while in PMC$_{\lightning}$, it could also be via $(\oslash \triangleright c \to \lightning)$.

6 Confluence and Formalisation 

Just among the first-order rules, four critical pairs arise: where the matching
delimiters $\mid$ and $\{|\, \_ \,|\}$ on the one hand are eliminated by failure $\lightning$ or expression
matchings $\upharpoonleft e \upharpoonright$, and on the other hand are traversed by argument supply. None
of these critical pairs is resolved by single steps of simple parallel reduction. It
is easy to see that a shortcut rule, such as $\{|\, a \triangleright \upharpoonleft e \upharpoonright \,|\} \longrightarrow e\; a$, immediately gives
rise to a new critical pair that would need to be resolved by a longer shortcut
rule, in this case $\{|\, b \triangleright a \triangleright \upharpoonleft e \upharpoonright \,|\} \longrightarrow e\; a\; b$.

A more systematic approach than introducing an infinite number of such 
shortcut rules is to adopt Aczel’s approach [1] to parallel reduction that also 
reduces redexes created “upwards” by parallel reduction steps. Confluence of 
PMC reduction can then be shown by establishing the diamond property for the 
parallel reduction relations. 
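
As an illustration, one of the four critical pairs can be written out as follows. This is a sketch using only the rules of Sect. 4, with a, e and m arbitrary, writing $\upharpoonleft e \upharpoonright$ for the expression matching “return e”, $\triangleright$ for argument supply and $\mid$ for alternative. The matching $a \triangleright (\upharpoonleft e \upharpoonright \mid m)$ can be rewritten either by eliminating the alternative first, or by distributing the argument supply first:

$$
\begin{array}{l}
a \triangleright (\upharpoonleft e \upharpoonright \mid m) \;\longrightarrow\; a \triangleright \upharpoonleft e \upharpoonright \;\longrightarrow\; \upharpoonleft e\; a \upharpoonright \\[1ex]
a \triangleright (\upharpoonleft e \upharpoonright \mid m) \;\longrightarrow\; (a \triangleright \upharpoonleft e \upharpoonright) \mid (a \triangleright m) \;\longrightarrow\; \upharpoonleft e\; a \upharpoonright \mid (a \triangleright m) \;\longrightarrow\; \upharpoonleft e\; a \upharpoonright
\end{array}
$$

Both paths meet at $\upharpoonleft e\; a \upharpoonright$, but on the second path the final redex only comes into existence through the step before it. This appears to be the kind of “upwards” created redex that a simple parallel step cannot contract and that the extended parallel reduction is designed to cover.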

Using a formalisation in Isabelle-2003/Isar/HOL [15], I have performed a
machine-checked proof of this confluence result.¹ Since both de Bruijn indexing
and translation into higher-order abstract syntax would have required considerable
technical effort and would have resulted in proving properties less obviously
related to the pattern matching calculus as presented here, I have chosen as basis
for the formalisation the Isabelle-1999/HOL theory set used by Vestergaard and
Brotherston in their confluence proof for λ-calculus [22]. This formalisation is
based on first-order abstract syntax and makes all the issues involved in variable
renaming explicit. Therefore, the formalisation includes the rules as given in
Sect. 4 with the same side-conditions; only the formalisation of the substituting
variable match rule $(\triangleright v)$ has an additional side-condition ensuring permissible
substitution in analogy with the treatment of the β-rule in [22].

Vestergaard and Brotherston employed parallel reduction in the style of the
Tait/Martin-Löf proof method, and used Takahashi’s proof of the diamond property
via complete developments. For PMC, we had to replace this by the Aczel-style
extended parallel reduction relations, and a direct case analysis for the
diamond property of these relations.

Due to the fact that we are employing two mutually recursive syntactic categories
(in Isabelle, the argument list of constructors actually counts as a third
category in that mutual recursion), and due to the number of constructors of the
pattern matching calculus (twelve including the list constructors, as opposed to
three in the λ-calculus), the number of constituent positions in these constructors
(twelve, including those of list construction, versus three), and the number
of reduction rules (thirteen versus one), there is considerable combinatorial
blow-up in the length of both the formalisation and the necessary proofs.

¹ The proof is available at URL: http://www.cas.mcmaster.ca/~kahl/PMC/

7 Normalisation 

Since PMC is intended to serve as operational semantics for lazy functional 
programming with pattern matching, we give a reduction strategy that reduces 
expressions and matchings to strong head normal form (SHNF), see, e.g., [19, 
Sect. 4.3] for an accessible definition. With the set of rules defined in Sect. 4, the 
following facts about SHNFs are easily seen: 

— Variables, constructor applications, the empty expression $\oslash$, failure $\lightning$, expression
matchings $\upharpoonleft e \upharpoonright$, and pattern matchings $p \Mapsto m$ are already in SHNF.

— All rules that have an application f a at their top level have a metavariable
for a, and none of these rules has a metavariable for f, so f a is in SHNF if
f is in SHNF and f a is not a redex.

— A matching abstraction $\{|\, m \,|\}$ is in SHNF if m is in SHNF, unless $\{|\, m \,|\}$ is a
redex for one of the rules $(\{|\lightning|\})$ or $(\{|\upharpoonleft\upharpoonright|\})$.

— Since all alternative rules have metavariables for $m_2$, an alternative $m_1 \mid m_2$
is in SHNF if $m_1$ is in SHNF, unless $m_1 \mid m_2$ itself is a redex.

— No rule for argument supply $a \triangleright m$ has a metavariable for m, and all rules
for argument supply $a \triangleright m$ that have non-metavariable a have m of shape
$c(p_1, \ldots, p_n) \Mapsto m'$. Therefore, if $a \triangleright m$ is not a redex, it is in SHNF if m is
in SHNF and, whenever m is of the shape $c(p_1, \ldots, p_n) \Mapsto m'$, a is in SHNF,
too.

Due to the homogeneous nature of its rule set, PMC therefore has a deterministic
strategy for reduction of applications, matching abstractions, alternatives, and
argument supply to SHNF:

— If an application f a is a redex, reduce it; otherwise if f is not in SHNF,
proceed into f.

— For a matching abstraction $\{|\, m \,|\}$, if m is not in SHNF, proceed into m,
otherwise reduce $\{|\, m \,|\}$ if it is a redex.

— For an alternative $m_1 \mid m_2$, if $m_1$ is not in SHNF, proceed into $m_1$, otherwise
reduce $m_1 \mid m_2$ if it is a redex.

— If an argument supply $a \triangleright m$ is a redex, reduce it (this is essential for the
case where m is of shape $m_1 \mid m_2$, which is not necessarily in SHNF, and $(\triangleright\mid)$
has to be applied). Otherwise, if m is not in SHNF, proceed into m. If m is
of the shape $c(p_1, \ldots, p_n) \Mapsto m'$, and a is not in SHNF, proceed into a.

Matching abstractions and alternatives are redexes only if the selected con- 
stituent is in SHNF — this simplified the formulation of the strategy for these 
cases. 

This strategy induces a deterministic normalising strategy in the usual way. 
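
The rules of Sect. 4 and the strategy above are small enough to be prototyped. The following Haskell sketch is an illustration under stated assumptions, not the mechanised formalisation: all names are chosen here (Ret for expression matching, Fail for failure, Empty for the empty expression, `:=>` for pattern matching, `:>` for argument supply, `:|` for alternative; Propagate and Resurrect select PMC with propagating empty expression or with resurrected failure), substitution is naive and assumes that no variable capture can occur, and a step budget guards against divergence.

```haskell
type Var = String

data Pat   = PVar Var | PCon String [Pat] deriving (Eq, Show)
data Expr  = Var Var | Con String [Expr] | App Expr Expr | Abs Match | Empty
  deriving (Eq, Show)
data Match = Ret Expr | Fail | Pat :=> Match | Expr :> Match | Match :| Match
  deriving (Eq, Show)
infixr 5 :=>, :>
infixr 4 :|

-- Choice of right-hand side when the empty expression meets a constructor
-- pattern: propagate the empty expression, or resurrect the failure.
data Variant = Propagate | Resurrect

pvars :: Pat -> [Var]
pvars (PVar v)    = [v]
pvars (PCon _ ps) = concatMap pvars ps

-- Naive substitution; ASSUMPTION: no variable capture can occur.
substE :: Var -> Expr -> Expr -> Expr
substE v a e = case e of
  Var w | w == v -> a
  Con c es       -> Con c (map (substE v a) es)
  App f x        -> App (substE v a f) (substE v a x)
  Abs m          -> Abs (substM v a m)
  _              -> e

substM :: Var -> Expr -> Match -> Match
substM v a m = case m of
  Ret e   -> Ret (substE v a e)
  Fail    -> Fail
  p :=> n | v `elem` pvars p -> m
          | otherwise        -> p :=> substM v a n
  x :> n  -> substE v a x :> substM v a n
  n :| n' -> substM v a n :| substM v a n'

-- Redex reduction at the root of an expression.
rootE :: Expr -> Maybe Expr
rootE (Abs Fail)      = Just Empty            -- failing abstraction
rootE (Abs (Ret e))   = Just e                -- abstraction of "return e"
rootE (App (Abs m) a) = Just (Abs (a :> m))   -- application becomes supply
rootE (App Empty _)   = Just Empty            -- empty absorbs arguments
rootE _               = Nothing

-- Redex reduction at the root of a matching.
rootM :: Variant -> Match -> Maybe Match
rootM _ (Fail :| m)           = Just m
rootM _ (Ret e :| _)          = Just (Ret e)
rootM _ (a :> Ret e)          = Just (Ret (App e a))
rootM _ (_ :> Fail)           = Just Fail
rootM _ (a :> (m :| n))       = Just ((a :> m) :| (a :> n))
rootM _ (a :> (PVar v :=> m)) = Just (substM v a m)
rootM _ (Con d es :> (PCon c ps :=> m))
  | c == d && length es == length ps =
      Just (foldr (\(e, p) k -> e :> (p :=> k)) m (zip es ps))
  | otherwise = Just Fail
rootM Propagate (Empty :> (PCon _ _ :=> _)) = Just (Ret Empty)
rootM Resurrect (Empty :> (PCon _ _ :=> _)) = Just Fail
rootM _ _ = Nothing

-- One step of the strategy; Nothing means "already in SHNF".
stepE :: Variant -> Expr -> Maybe Expr
stepE var e = case e of
  App f a -> maybe (fmap (`App` a) (stepE var f)) Just (rootE e)
  Abs m   -> maybe (rootE e) (Just . Abs) (stepM var m)
  _       -> Nothing

stepM :: Variant -> Match -> Maybe Match
stepM var m = case m of
  m1 :| m2 -> maybe (rootM var m) (Just . (:| m2)) (stepM var m1)
  a :> n   -> case (rootM var m, stepM var n, n) of
    (Just m', _, _)        -> Just m'
    (_, Just n', _)        -> Just (a :> n')
    (_, _, PCon _ _ :=> _) -> fmap (:> n) (stepE var a)
    _                      -> Nothing
  _        -> Nothing

-- Iterate the strategy, with a step budget because reduction may diverge.
shnf :: Variant -> Int -> Expr -> Expr
shnf var fuel e
  | fuel <= 0 = e
  | otherwise = maybe e (shnf var (fuel - 1)) (stepE var e)

-- Translation of the two-equation function f from Sect. 3, applied to the
-- empty expression and to the list 3:[].
example :: Expr
example = App (App (Abs (alt1 :| alt2)) Empty) (Con ":" [Con "3" [], Con "[]" []])
  where
    alt1 = PCon ":" [PVar "x", PVar "xs"] :=> PCon "[]" [] :=> Ret (Con "1" [])
    alt2 = PVar "ys" :=> PCon ":" [PVar "v", PVar "vs"] :=> Ret (Con "2" [])
```

Evaluating `shnf Propagate 100 example` yields `Empty`, while `shnf Resurrect 100 example` yields `Con "2" []`, in agreement with the two results of Sect. 5. The individual steps need not coincide with the sequences displayed there, since this sketch distributes the outer argument supply as soon as it becomes a redex.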
8 Related Work

In Peyton Jones’ book [18], Chapter 4 by Peyton Jones and Wadler introduces
a “new built-in value FAIL, which is returned when a pattern-match fails”
(p. 61). In addition, their “enriched λ-calculus” also contains an alternative constructor,
for which FAIL is the identity, thus corresponding to our failure $\lightning$.
However, FAIL only occurs in contexts where there is a right-most ERROR alternative
(errors ERROR are distinguished from non-termination $\bot$), so there is
no opportunity to discover that, in our terms, $\{|\, \mathsf{FAIL} \,|\} = \mathsf{ERROR}$. Also, errors
always propagate; since ERROR corresponds to our empty expression $\oslash$, their
error behaviour corresponds to our rule $(\oslash \triangleright c \to \oslash)$. Wadler’s Chapter 5, one of
the standard references for compilation of pattern matching, contains a section
about optimisation of expressions containing alternative and FAIL, arguing along
lines that would be simplified by our separation into matchings and expressions.

Similarly, Tullsen includes a primitive “failure combinator” that never succeeds
among his “First Class Patterns” combinators extending Haskell [21]. He
uses a purely semantic approach, with functions into a Maybe type or, more
generally, into a MonadPlus as “patterns”. In this way, he effectively embeds
our two-sorted calculus into the single sort of Haskell expressions, with a fixed
interpretation. However, since expressions are non-patterns, Tullsen’s approach
treats them as standard Haskell expressions and, therefore, does not have the
option to consider “resurrecting failures” as in our rule $(\oslash \triangleright c \to \lightning)$. Harrison
et al. follow a similar approach for modelling Haskell’s evaluation-on-demand in
detail [9]; they consider “case branches p -> e” as separate syntactical units
(such a case branch is a PMC matching $p \Mapsto \upharpoonleft e \upharpoonright$) and interpret them as functions
into a Maybe type; the interpretation of case expressions translates failure into
bottom, like in Tullsen’s approach.

Erwig and Peyton Jones, together with their proposal of pattern guards, in
[6] also proposed to use a Fail exception for allowing pattern matching failure
as result of conventional Haskell expressions, and explicitly mention the possibility
to catch this exception in the same or in another case expression. This
is the only place in the literature where we encountered an approach somewhat
corresponding to our rule $(\oslash \triangleright c \to \lightning)$; we feel that our formalisation in the
shape of PMC$_{\lightning}$ can contribute significantly to the further exploration of this
option.

Van Oostrom defined an untyped λ-calculus with patterns in [16], abstracting
over (restricted) λ-terms. This calculus does not include mismatch rules and
therefore requires complicated encoding of typical multi-pattern definitions.

Typed pattern calculi with less relation to lazy functional programming are 
investigated by Delia Kesner and others in, e.g., [3,8]. Patterns in these calculi 
can be understood as restricted to those cases that are the result of certain 
kinds of pattern compilation, and therefore need not include any concern for 
incomplete alternatives or failure propagation. 

As explained in the introduction, pattern matching can be seen as an inter- 
nalisation of term rewriting; PMC represents an internalisation of the functional 
rewriting strategy described for example in [19]. Internalisation of general, non-deterministic
term rewriting has been studied by H. Cirstea, C. Kirchner and
others as the rewriting calculus, also called ρ-calculus [4,5], and, most recently,
in typed variants as “pure pattern type systems” [2]. The ρ-calculus is parameterised
by a theory modulo which matching is performed; this can be used to
deal with views [23]. Since the ρ-calculus allows arbitrary expressions as patterns,
confluence holds only under restriction to call-by-value strategies. The
ρ-calculus has unordered sets of alternatives that can also be empty; in [7] a
distinction between matching failure and the empty alternative has been added
for improving support for formulating rewriting strategies as ρ-calculus terms.
Since matching failure in the ρ-calculus is an expression constant, it can occur
as function or constructor argument, and proper propagation of failure must be
enforced by call-by-value evaluation.

Maranget [13] describes “automata with failures” as used by several compilers.
Translated into our formalism, this introduces a default matching that
never matches, but transfers control to the closest enclosing alternative containing
a wildcard (variable) pattern. This can be seen as closely related with our rule
$(\oslash \triangleright c \to \lightning)$. Maranget (1994) used this feature as a way of allowing backtracking
during pattern matching. In [12], this is extended to labelled exceptions, and
can also be understood as a way of implementing sharing between alternatives.



9 Conclusion and Outlook 

The pattern matching calculus PMC$_{\oslash}$ turns out to be a simple and elegant
formalisation of the operational pattern matching semantics of current non-strict
functional programming languages. PMC$_{\oslash}$ is a confluent reduction system with
a simple deterministic normalising strategy, and therefore does not require the
complex prioritisation mechanisms of the functional rewriting strategy or other
pattern matching definitions.

In addition, we have shown how changing a single rule produces the new calculus
PMC$_{\lightning}$, which results in “more successful” evaluation, but is still confluent
and normalising. Therefore, PMC$_{\lightning}$ is a promising foundation for further exploration
of the “failure as exception” approach proposed by Erwig and Peyton
Jones, for turning it into a basis for programming language implementations,
and for relating it with Maranget’s approach.

The technical report [10] shows, besides other details, also a simple polymor- 
phic typing discipline, and the inclusion of an explicit fixedpoint combinator, 
which preserves confluence and normalisation. 

The next step will be an investigation of theory and denotational semantics
of both calculi: For PMC$_{\oslash}$, the most natural approach will be essentially the
Maybe semantics of pattern matching as proposed in [21,9]. For PMC$_{\lightning}$, the
semantic domain for expressions needs to include an alternative for failure, too,
to represent the semantics of empty expressions $\oslash$.

We also envisage that pattern matching calculi would be a useful basis for
interactive program transformation and reasoning systems for Haskell, similar
to what Sparkle [14] is for Clean.
Abstract. We present a method for automatic program inversion of
functional programs based on methods of LR parsing. We formalize the 
transformation and illustrate it with the inversion of a program for run- 
length encoding. We solve one of the main problems of automatic pro- 
gram inversion — the elimination of nondeterminism — by viewing an in- 
verse program as a context-free grammar and applying to it methods 
of LR parsing to turn it into a recursive, deterministic inverse program. 
This improves the efficiency of the inverse programs and greatly expands 
the application range of our earlier method for program inversion. 



1 Introduction 

The contribution of this paper is an automatic method for program inversion 
based on methods of LR parsing. This transformation improves the efficiency of 
the inverse programs and extends the application range of our earlier method 
by allowing the inversion of programs based on a global transformation of a 
nondeterministic inverse program. We make use of a self-inverse primitive func- 
tion for the duplication of values and testing of equality, which we introduced in 
recent work [8], and a symmetric program representation to simplify inversion. 
To eliminate nondeterminism in a global view, we apply methods for LR pars- 
ing by viewing an inverse program as a context-free grammar and generating a 
deterministic inverse program if that grammar has certain properties (e.g., without
parsing conflicts). This greatly expands the application range of our recent
method for program inversion. 

The idea of program inversion can be traced back to reference [6]. Recent 
work [14] has focused on the converse of a function theorem [4], inverse
computation of functional programs [2], and the transformation of interpreters into
inverse interpreters by partial evaluation [9]. Logic programming is suited to 
find multiple solutions and can be used for inverse interpretation, while in this 
paper we are interested in program inversion (for a detailed description of these 
notions, see reference [1]). We consider one-to-one functional programs and not 
relations with multiple solutions. An example is the generation of a program for
decoding data given a program for encoding data, and vice versa. In general,
the goal of a program inverter is to find an inverse program $q^{-1} : B \to A$ of a
program $q : A \to B$ such that for all values $x \in A$ and $y \in B$ we have

$$q(x) = y \iff q^{-1}(y) = x .$$

This tells us that, if a program q terminates on input x and returns output y,
then the inverse program $q^{-1}$ terminates on y and returns x, and vice versa.
This implies that both programs are injective; they need not be surjective or
total. Here, equality means strong equivalence: either both sides of an equation
are defined and equal, or both sides are undefined. In practice, even when it is
certain that an efficient inverse program $q^{-1}$ exists, the automatic generation of
such a program from q may be difficult or impossible.¹

    q ::= d_1 ... d_n                                    (program)
    d ::= f(x_1, ..., x_n) = t                           (definition)
    t ::= (l_1, ..., l_m)                                (return)
        | case l of {p_i → t_i} (i = 1, ..., m)          (case-expression)
        | let (y_1, ..., y_m) = f(l_1, ..., l_n) in t    (let-expression)
    l ::= x                                              (variable)
        | c(l_1, ..., l_n)                               (constructor)
        | ⌊l⌋                                            (duplication/equality)
    p ::= c(x_1, ..., x_n)                               (pattern)

Fig. 1. Abstract syntax of the source language

The first method developed for automatic program inversion of first-order 
functional programs appears to be the program inverter by Korf and Epp- 
stein [13,7] (we call it KEinv for short). It is one of only two general-purpose auto- 
matic program inverters that have been built (the other one is InvX [12]). Manual 
methods [6,11,4,14] and semi-automatic methods [5] exist, but require ingenuity 
and human insight. Our goal is to achieve further automation of general-purpose 
program inversion. 

This paper is organized as follows. First, we define the source language 
(Sect. 2). Then we discuss our solution of the main challenges of program in- 
version (Sect. 3) and present our inversion method (Sect. 4). We discuss related 
work (Sect. 5), and then give a conclusion (Sect. 6). We assume that the reader 
is familiar with the principles of LR parsing, e.g., as presented in [3]. 

2 Source Language 

We are concerned with a first-order functional language. A program q is a
sequence of function definitions d where the body of each definition is a term t
constructed from variables, constructors, function calls, case- and let-expressions
(Fig. 1 where $m > 0$, $n \ge 0$). For reasons of symmetry, functions may return
multiple output values, which is denoted by syntax $(l_1, \ldots, l_m)$. Arity and
coarity of functions and constructors are fixed. The language has a call-by-value
semantics. A value v in the language is a constructor c with arguments $v_1, \ldots, v_n$:

$$v ::= c(v_1, \ldots, v_n)$$

    pack(s) = case s of
                []  → ([])
                c:r → let (p, l) = len(c, O, r) in (p : pack(l))

    len(c, n, s) = case s of
                []  → ((c, n), [])
                d:r → case ⌊(c, d)⌋ of
                        (e)    → let (p, t) = len(e, S(n), r) in (p, t)
                        (e, f) → ((e, n), f:r)

Fig. 2. Program pack

¹ There exists a program inverter that returns, for every q, a trivial inverse $q^{-1}$ [1].

An example is the program for run-length encoding (Fig. 2): function pack
encodes a list of symbols as a list of symbol-number pairs, where the number
specifies how many copies of a symbol have to be generated upon decoding. For
instance, pack([AABCCC]) = [(A,2)(B,1)(C,3)].² Function pack maximizes the
counter: we never have an encoding like (C,2)(C,1), but rather, always (C,3).
This implies that the symbols in two adjacent symbol-number pairs are never
equal. Fig. 3 shows the inverse function $pack^{-1}$. In the implementation, we use
unary numbers where O denotes One and S the Successor. The primitive function
$\lfloor\cdot\rfloor$ checks the equality of two values: $\lfloor (v, v') \rfloor = (v)$ if $v = v'$. In the absence
of equality, the values are returned unchanged: $\lfloor (v, v') \rfloor = (v, v')$ if $v \neq v'$. This
will be defined below.
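
The behaviour of pack and of its inverse can be illustrated outside the source language. The following Python sketch is only an approximation under stated assumptions: the names `pack` and `unpack`, the use of Python integers instead of unary numbers, and one-character strings as symbols are all choices made for this illustration, not part of the paper's language.

```python
def pack(symbols):
    """Run-length encode a list of symbols into (symbol, count) pairs."""
    encoded = []
    for sym in symbols:
        if encoded and encoded[-1][0] == sym:
            # Same symbol as the current run: maximize the counter.
            encoded[-1] = (sym, encoded[-1][1] + 1)
        else:
            encoded.append((sym, 1))
    return encoded


def unpack(pairs):
    """Inverse of pack; rejects inputs that pack can never produce."""
    decoded = []
    previous = None
    for sym, count in pairs:
        # Adjacent pairs with equal symbols are outside the range of pack.
        if count < 1 or sym == previous:
            raise ValueError("not in the range of pack")
        decoded.extend([sym] * count)
        previous = sym
    return decoded
```

Rejecting (C,2)(C,1) in `unpack` mirrors the observation that adjacent pairs never carry equal symbols, which is what keeps the two functions inverse to each other on exactly the values they are defined for.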

We consider only well-formed programs. As usual, we require that no two 
patterns pi and pj in a case-expression contain the same constructor and that 
all patterns are linear (no variable occurs more than once in a pattern) . We also 
require that each variable be defined before its use and, for simplicity, that no 
defined variable be redefined by a case- or let-expression. 



Duplication and Equality One of our key observations [8] was that duplica- 
tion and equality testing are two sides of the same coin in program inversion: 
the duplication of a value in a program becomes an equality test in the inverse 
program, and vice versa. To simplify inversion, we introduce a primitive function 
$\lfloor\cdot\rfloor$ defined as follows:

$$\lfloor v \rfloor \stackrel{\mathrm{def}}{=} (v, v) \qquad \text{(duplication)}$$

$$\lfloor (v, v') \rfloor \stackrel{\mathrm{def}}{=} \begin{cases} (v) & \text{if } v = v' \\ (v, v') & \text{if } v \neq v' \end{cases} \qquad \text{(equality test)}$$

² We use the shorthand notation x:xs and [ ] for the constructors Cons(x, xs) and
Nil. For $x_1 : x_2 : \ldots : x_n : [\,]$ we write $[x_1 x_2 \ldots x_n]$, or sometimes $x_1 x_2 \ldots x_n$. A tuple
$(x_1, \ldots, x_n)$ is a shorthand notation for an n-ary constructor $C_n(x_1, \ldots, x_n)$.


There are mainly two ways of using this function: duplication and equality test- 
ing. In the former case, given a single value, a pair with identical values is 
returned; in the latter case, given a pair of identical values, a single value is re- 
turned; otherwise the pair is returned unchanged. The advantage of this unusual 
function definition is that it makes it easier to deal with duplication and equality 
testing in program inversion. The function has a useful property, namely that it 
is self-inverse, which means it is its own inverse: $\lfloor\cdot\rfloor^{-1} = \lfloor\cdot\rfloor$.

For example, in function len (Fig. 2) the equality of two adjacent symbols, 
c and d, is tested in the innermost case-expression. The assertion that those 
symbols are not equal is checked in forward and backward computation. 
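
The self-inverse property can be checked on a small model. The sketch below is an illustration only: it assumes that a single value is represented as a Python 1-tuple and a pair as a 2-tuple, and the name `dupeq` is hypothetical.

```python
def dupeq(value):
    """Sketch of the duplication/equality primitive on Python tuples.

    A 1-tuple (v,) is duplicated to (v, v); a 2-tuple (v, w) collapses to
    (v,) when v == w and is returned unchanged otherwise.
    """
    if len(value) == 1:
        (v,) = value
        return (v, v)
    if len(value) == 2:
        v, w = value
        return (v,) if v == w else (v, w)
    raise ValueError("dupeq expects a 1-tuple or a 2-tuple")
```

Applying `dupeq` twice returns the original argument in all three cases (single value, equal pair, unequal pair), which is the property used when a program is read backwards.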

3 Challenges to Program Inversion 

The most challenging point in program inversion is the inversion of conditionals 
(here, case-expressions). To calculate the input from a given output, we must 
know which of the m branches in a case-expression the source program took to 
produce that output, since only one of the branches was executed in the forward 
calculation (our language is deterministic). To make this choice in an inverse 
program, we must know m postconditions, $R_i$, one for each branch, such that
for each pair of postconditions, we have: $R_i \wedge R_j = \mathit{false}$ ($1 \le i < j \le m$).
This divides the set of output values into m disjoint sets, and we can choose the 
correct branch by testing the given output value using the postconditions. 
Postconditions that are suitable for program inversion can be derived by
hand (e.g., [6,11]). In automatic program inversion they must be inferred from
a source program. The program inverter KEinv [7] uses a heuristic method, and 
the language in which its postconditions are expressed consists of the primitive 
predicates available for the source language’s value domain consisting of lists 
and integers. In general, there is no automatic method that would always find 
mutually exclusive postconditions, even if they exist. 

A nondeterministic choice is an unspecified choice from a number of alterna- 
tives. Not every choice will lead to a successful computation. If there is only one 
choice to choose from, then the computation is deterministic. 

In previous work [8, Sect. 4], we gave a local criterion for checking whether
the choice in an inverse program will be deterministic. We viewed the body of
a function as a tree with the head of the function definition as the root, and
required that the expressions in the leaves of the tree return disjoint sets of
values. For example, the expressions in the two leaves of function pack are ([])
and (p:pack(l)). Clearly, they represent disjoint sets of values. This works surprisingly
well for a class of programs, but is too restricted for other cases. For example, 
consider function len (Fig. 2). It has three leaf expressions each of which returns 
two values (a symbol-number pair and the remaining list of symbols): 

    1. ((c,n),[])      2. (p,t)      3. ((e,n),f:r)

Our local criterion can distinguish the sets of values returned by leaves (1) and (3),
but it is not sufficient for leaf (2). The set of values represented by (2) is not
disjoint from (1) and (3). In fact, it is the union of (1) and (3).
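
The comparison of the three leaves can be mimicked by a simple overlap test on linear patterns. This is a sketch under assumptions, not the criterion of [8] itself: the term encoding (`var`, `con`), the constructor names `Tup2`, `Nil` and `Cons`, and the function `may_overlap` are all hypothetical, and a variable is taken to stand for any value.

```python
def var(name):
    """A pattern variable; it stands for an arbitrary value."""
    return ("var", name)


def con(name, *args):
    """A constructor applied to sub-patterns."""
    return ("con", name, args)


def may_overlap(a, b):
    """True if two linear patterns can describe a common value."""
    if a[0] == "var" or b[0] == "var":
        return True
    _, name_a, args_a = a
    _, name_b, args_b = b
    if name_a != name_b or len(args_a) != len(args_b):
        return False
    return all(may_overlap(x, y) for x, y in zip(args_a, args_b))


# The three leaf expressions of len, with tuples encoded as Tup2.
LEAF_1 = con("Tup2", con("Tup2", var("c"), var("n")), con("Nil"))
LEAF_2 = con("Tup2", var("p"), var("t"))
LEAF_3 = con("Tup2", con("Tup2", var("e"), var("n")),
             con("Cons", var("f"), var("r")))
```

Under this encoding, leaves (1) and (3) are separated by Nil versus Cons, whereas leaf (2) consists of variables only and therefore overlaps with both.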

This paper deals with this limitation of our previous method by applying 
methods from LR parsing to determine whether the choice is deterministic and to 
generate a recursive inverse program. These techniques replace our local criterion.
As we shall see, a deterministic inverse program $pack^{-1}$ can be derived from pack
by the method introduced in this paper.



Dead Variables Another problematic point in program inversion is when input 
values are discarded. Consider the selection function first defined by 

    first(x) = case x of h:t → h

When we invert such a program, we have to guess ‘lost values’ (here, a value for 
t). In general, there are infinitely many possible guesses. We adopted a straight- 
forward solution which we call the “preservation of values” requirement. For a 
program well-formed for inversion, we also require that each defined variable 
be used exactly once. Thus, a variable’s value is always part of the output, and 
the only way to “diminish the amount of output” is to reduce pairs of values
into singletons by $\lfloor\cdot\rfloor$. For example, we write case $\lfloor (x, y) \rfloor$ of $(z) \to \ldots$ . This
expression ensures that no information is lost because all values need to be identical.
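
The problem with discarded values can be seen in a few lines. The sketch below is an illustration on Python lists; the helper names `uncons` and `cons` are assumptions introduced here and do not appear in the paper.

```python
def first(xs):
    """Return the head and silently drop the tail (a dead variable)."""
    head, *_tail = xs
    return head


def uncons(xs):
    """Value-preserving variant: the tail stays part of the output."""
    head, *tail = xs
    return head, tail


def cons(head, tail):
    """Inverse of uncons: rebuild the list from head and tail."""
    return [head, *tail]
```

Two different inputs of `first` can yield the same output, so an inverse would have to guess the tail; `uncons` keeps every input value in its output and can be undone by `cons`.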
    q ::= d_1 ... d_n                               (program)
    d ::= f → t_1 ... t_n                           (definition)
    t ::= in(x_1, ..., x_n)                         (input)
        | out(y_1, ..., y_n)                        (output)
        | c(x_1, ..., x_n) = y                      (constructor)
        | x = c(y_1, ..., y_n)                      (pattern matching)
        | ⌊x⌋ = y                                   (duplication / equality)
        | f(x_1, ..., x_n) = (y_1, ..., y_m)        (function call)

Fig. 4. Abstract syntax of the symmetric language

4 A Method for Program Inversion 

We now present a method for the automatic inversion of programs that are
well-formed for inversion. Our method uses a symmetric representation of a program
as internal representation for inverting primitive operators and a grammar
representation for eliminating nondeterminism. It consists of translations
$\mathit{SYM}[\![\cdot]\!]$, $\mathit{GRAM}[\![\cdot]\!]$ and $\mathit{FCT}[\![\cdot]\!]$ that translate from the source language to the
internal representation and vice versa. Local inversion $\mathit{INV}[\![\cdot]\!]$ is then performed by
backward reading of the symmetric representation. Finally, $\mathit{DET}[\![\cdot]\!]$ attempts
to eliminate nondeterminism by LR parsing techniques. To simplify inversion
of a program, inversion is carried out on a symmetric representation, rather
than on the source program. We now give the definition of the program inverter
and explain each of its components in the remainder of this section.

Definition 1 (program inverter). Let q be a program well-formed for inversion.
Then program inverter $[\![\cdot]\!]^{-1}$ is defined by

$$[\![ q ]\!]^{-1} \stackrel{\mathrm{def}}{=} \mathit{FCT}[\![\, \mathit{DET}[\![\, \mathit{GRAM}[\![\, \mathit{INV}[\![\, \mathit{SYM}[\![ q ]\!] \,]\!] \,]\!] \,]\!] \,]\!]$$



4.1 Translation to the Symmetric Language 

The translation of a function to a symmetric representation makes it easier to 
invert the function. During the translation, each construct is decomposed into a 
sequence of atomic operations. The syntax of the symmetric language is shown 
in Fig. 4. An atomic operation t is either a construct that marks several variables
as input $in(x_1, \ldots, x_n)$ or as output $out(y_1, \ldots, y_n)$, an equality representing a
constructor application $c(x_1, \ldots, x_n) = y$, a pattern matching $x = c(y_1, \ldots, y_n)$, an
operator application $\lfloor x \rfloor = y$, or a function call $f(x_1, \ldots, x_n) = (y_1, \ldots, y_m)$. As a
convention, the left-hand side of an equation is defined only in terms of input
variables (here x) and the right-hand side is defined only in terms of output
variables (here y). The intended forward reading of a sequence of equalities is
from left to right; the backward reading will be from right to left. A function is
represented by one or more linear sequences of atomic operations. If a match
operation in a sequence fails, the next sequence is tried. For instance, examine the
result of translating function pack into the symmetric representation in Fig. 12.

The translation is defined in Fig. 5. Function $\mathit{symt}[\![\cdot]\!]$ performs a recursive
descent over t expressions until it reaches a return expression; function $\mathit{syml}[\![\cdot]\!]$
translates l expressions. The translation fixes an evaluation order when
translating expressions with multiple arguments (other orders are possible). Notation
X denotes a fresh variable; fresh variables act as liaison variables.

Definition 2 (frontend). Let d be a definition in a program well-formed for
inversion. Then, the translation from the functional language to the symmetric
language is defined by

$$\mathit{SYM}[\![ q ]\!] \stackrel{\mathrm{def}}{=} \{\, \mathit{Sym}[\![ d ]\!] \mid d \in q \,\}$$



4.2 Local Inversion of a Symmetric Program 

Operations in the symmetric representation are easily inverted by reading the 
intended meaning backwards. Every construct in the symmetric language has 
an inverse construct. Each function definition is inverted separately. The idea of 
inverting programs by ‘backward reading’ is not new and can be found in [6,11]. 
The rules for our symmetric representation are shown in Fig. 6. Global inversion 
of a program at this stage is based on the local invertibility of atomic operations. 

The inverse of $in(x_1, \ldots, x_n)$ is $out(x_1, \ldots, x_n)$ and vice versa; the inverse of
constructor application $c(x_1, \ldots, x_n) = y$ is pattern matching $y = c(x_1, \ldots, x_n)$ and
vice versa; the inverse of function call $f(x_1, \ldots, x_n) = (y_1, \ldots, y_m)$ is
$f^{-1}(y_1, \ldots, y_m) = (x_1, \ldots, x_n)$. As explained in Sect. 2, primitive function $\lfloor\cdot\rfloor$ is
its own inverse. Thus, the inverse of $\lfloor x \rfloor = y$ is $\lfloor y \rfloor = x$. Observe that the inversion
performs no unfold/fold on functions. It terminates on all programs.
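
Backward reading is mechanical enough to be sketched in a few lines. The following Python fragment is an illustration under assumptions: the tuple encoding of atomic operations (`"in"`, `"out"`, `"cons"`, `"match"`, `"dupeq"`, `"call"`), the suffix `^-1` for inverse function names, and all function names are hypothetical. The sample sequence is the second sequence of pack as listed in Fig. 12.

```python
def inverse_name(f):
    """Toggle the (hypothetical) '^-1' suffix that marks an inverse function."""
    return f[:-3] if f.endswith("^-1") else f + "^-1"


def invert_op(op):
    """Invert one atomic operation of the symmetric representation."""
    kind = op[0]
    if kind == "in":                # in(xs) becomes out(xs)
        return ("out", op[1])
    if kind == "out":               # out(xs) becomes in(xs)
        return ("in", op[1])
    if kind == "cons":              # c(xs) = y becomes y = c(xs)
        _, c, xs, y = op
        return ("match", y, c, xs)
    if kind == "match":             # x = c(ys) becomes c(ys) = x
        _, x, c, ys = op
        return ("cons", c, ys, x)
    if kind == "dupeq":             # dupeq(x) = y becomes dupeq(y) = x
        _, x, y = op
        return ("dupeq", y, x)
    if kind == "call":              # f(xs) = (ys) becomes f^-1(ys) = (xs)
        _, f, xs, ys = op
        return ("call", inverse_name(f), ys, xs)
    raise ValueError("unknown atomic operation: %r" % (kind,))


def invert_sequence(name, ops):
    """Read a sequence backwards: reverse it and invert every operation."""
    return inverse_name(name), [invert_op(op) for op in reversed(ops)]


# pack -> in(s), s=c:r, O=x, len(c,x,r)=(p,l), pack(l)=(y), p:y=z, out(z)
PACK_CONS = [
    ("in", ("s",)),
    ("match", "s", "Cons", ("c", "r")),
    ("cons", "O", (), "x"),
    ("call", "len", ("c", "x", "r"), ("p", "l")),
    ("call", "pack", ("l",), ("y",)),
    ("cons", "Cons", ("p", "y"), "z"),
    ("out", ("z",)),
]
```

Because every rule has a mirror image, applying `invert_sequence` twice gives back the original sequence; no analysis of the program as a whole is involved at this stage.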
    Inv⟦ f → t_1 ... t_n ⟧                      =  f⁻¹ → inv⟦ t_n ⟧ ... inv⟦ t_1 ⟧
    inv⟦ in(x_1, ..., x_n) ⟧                    =  out(x_1, ..., x_n)
    inv⟦ out(x_1, ..., x_n) ⟧                   =  in(x_1, ..., x_n)
    inv⟦ c(x_1, ..., x_n) = y ⟧                 =  y = c(x_1, ..., x_n)
    inv⟦ x = c(y_1, ..., y_n) ⟧                 =  c(y_1, ..., y_n) = x
    inv⟦ ⌊x⌋ = y ⟧                              =  ⌊y⌋ = x
    inv⟦ f(x_1, ..., x_n) = (y_1, ..., y_m) ⟧   =  f⁻¹(y_1, ..., y_m) = (x_1, ..., x_n)

Fig. 6. Rules for local inversion

The result of backward reading the symmetric representation of pack is shown
in Fig. 12. Compare pack before and after the inversion. Each atomic operation is
inverted according to the rules in Fig. 6. Program $pack^{-1}$ is inverse to pack, but
nondeterministic. We cannot translate $len^{-1}$ directly into a functional program
since the call $len^{-1}(p, t) = (e, z, r)$ is not guarded by pattern matching; the reader
is welcome to try.

Definition 3 (local inversion). Let q be a symmetric program well-formed for
inversion. Then, local inversion of q is defined by

$$\mathit{INV}[\![ q ]\!] \stackrel{\mathrm{def}}{=} \{\, \mathit{Inv}[\![ d ]\!] \mid d \in q \,\}$$



4.3 Translation to the Grammar Language 

After the inversion of atomic operations, nondeterminism can be eliminated by
viewing the program as a grammar. To make it easier to manipulate programs,
we hide variables by translating them into a grammar-like language. That
language operates on a stack instead of an environment. Each atomic operation in
the symmetric language is converted into a sequence of stack operations. The
syntax of the grammar language is shown in Fig. 7. A stack operation t is either
a constructor application c!, a pattern matching c?, an application of $\lfloor\_\rfloor$, a
function call f, or a selection $(i_1, \ldots, i_n)$. Each stack operation operates on the top
of the stack for input/output of the corresponding number of values, except for
selection, which moves each $i_j$-th stack element to the j-th position on the stack.
This is convenient for reordering the stack. For instance, the sequence (2) _:_?
swaps the two top-most values and, if the new top-most value is a cons, pops it
and pushes its head and tail components; otherwise the sequence fails. The result
of translating pack from the symmetric language into the grammar language is
shown in Fig. 12. The translation is defined in Fig. 8 where “++” appends two
lists.
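
The effect of the stack operations can be made concrete with a small model. The sketch below is an assumption-laden illustration: the stack is a Python list whose first element is the top, values are tuples of the form `(constructor, component, ...)`, and the function names `select`, `match` and `construct` are hypothetical.

```python
def select(stack, indices):
    """Selection: move the elements at the given 1-based positions to the top.

    The selected elements appear in the order given; the remaining
    elements keep their relative order below them.
    """
    picked = [stack[i - 1] for i in indices]
    rest = [v for pos, v in enumerate(stack, start=1) if pos not in indices]
    return picked + rest


def match(stack, constructor, arity):
    """c?: pop a value built with `constructor` and push its components.

    Returns None when the top value does not match (the sequence fails).
    """
    top = stack[0]
    if top[0] != constructor or len(top) - 1 != arity:
        return None
    return list(top[1:]) + stack[1:]


def construct(stack, constructor, arity):
    """c!: pop `arity` values and push the value built from them."""
    return [(constructor, *stack[:arity])] + stack[arity:]
```

With a stack holding a symbol on top of a one-element list, `select(stack, [2])` followed by `match(..., "Cons", 2)` reproduces the example sequence (2) _:_? from the text: the list moves to the top and is replaced by its head and tail.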

Definition 4 (midend). Let d be a definition in a program well-formed for
inversion. Then, the translation from the symmetric language to the grammar
language is defined by

$$\mathit{GRAM}[\![ q ]\!] \stackrel{\mathrm{def}}{=} \{\, \mathit{Gram}[\![ d ]\!] \mid d \in q \,\}$$

    q ::= d_1 ... d_n             (program)
    d ::= f → t_1 ... t_n         (definition)
    t ::= c!                      (constructor)
        | c?                      (pattern matching)
        | ⌊_⌋                     (duplication / equality)
        | f                       (function call)
        | (i_1, ..., i_n)         (selection)

Fig. 7. Abstract syntax of the grammar language

    Gram⟦ f → in(xs) ts ⟧       =  f → gram⟦ ts, xs ⟧
    gram⟦ out(zs), zs ⟧         =  ε
    gram⟦ c(xs)=y ts, zs ⟧      =  (is) c!  gram⟦ ts, y : zs\xs ⟧
    gram⟦ x=c(ys) ts, zs ⟧      =  (i) c?   gram⟦ ts, ys ++ zs\x ⟧
    gram⟦ ⌊x⌋=y ts, zs ⟧        =  (i) ⌊_⌋  gram⟦ ts, y : zs\x ⟧
    …                           =  ε

Notation: given xs and zs, (is) is an abbreviation for selection $(i_1, \ldots, i_n)$ where
number $i_j$ is the index of $x_j$ in zs for $1 \le j \le n$; in particular, i is the index of x in
zs; notation zs\xs denotes the deletion of all xs in zs.

Fig. 8. Translation from the symmetric language to the grammar language

4.4 Eliminating Nondeterminism

An LR(k) parser generator produces a deterministic parser given a context-free
grammar, provided that the grammar is LR(k). This class of parsing methods
is used in practically all parser generators (e.g., yacc) because it can parse
most programming language grammars. Our goal is to eliminate nondeterminism
from an inverse program. For this we will resort to the particular method of
LR(0) parsing. This parsing method is simpler than LR(1) parsing in that it
does not require the use of a lookahead operation in the generated parsers.

We found that LR(0) covers a large class of inverse programs. For example, a
tail-recursive program can be viewed as a right-recursive grammar; the recursive
call is at the end. Local inversion of a tail-recursive program always leads to
an inverse program that corresponds to a left-recursive grammar; the recursive
call is now at the beginning. Immediately, we face the problem of nondeterminism
because it represents an unguarded choice between immediately choosing
the recursive call or the base case. Such a program cannot be represented in a
functional language. This requires a transformation into a functionally equivalent
form where each choice is guarded by a conditional (see also Sect. 3). LR(0)
parsing allows us to deal directly with this type of grammar and, in many cases,
to convert the program into a deterministic version.

This is a main motivation for applying the method of LR(0) parsing, namely 
to derive deterministic inverse programs. Our method makes use of some of the 
methods of classical LR(0) parser generation, for example the construction of 
item sets by a closure operation, but generates a functional program instead of 
a table- or program-driven parser: 

1. Item sets: given the grammar representation of a program, the item sets are
computed by a closure operation.

2. Code generation: given conflict-free item sets, a deterministic functional pro- 
gram is generated. 

We will now discuss these operations in more detail. We assume that the reader 
is familiar with the main principles of LR parsing, e.g., as presented in [3]. Due to 
space limitations we cannot further review LR parsing and use standard terminology
without further definitions (e.g., item set, closure operation, shift/reduce
action). We show how these operations are adapted to our grammar language.

Remark: In our previous work [8, Sect. 4], we applied a local criterion to
a source program to ensure that the inverse program corresponds to an LL
grammar. Since LL parsing is strictly weaker than LR parsing, we conclude 
that applying an LR parsing approach to program inversion leads to a strictly 
stronger inversion algorithm. Recall that LL parsing cannot directly deal with 
left-recursive grammars and that any LL grammar can be parsed by an LR 
parser, but not vice versa. 



Item Sets We define a parse item of the grammar language (Fig. 7) by

$$f \to ts_1 \bullet ts_2$$

where $\bullet$ denotes the current position. To compute the set of item sets, we
define two operations which correspond to determining the parse actions: shift
$I \xrightarrow{t} I'$ from item set $I$ to item set $I'$ under symbol t, and reduce $I \stackrel{n}{\hookrightarrow} f$ from
item set $I$ by function symbol f and number of operations n.

$$I_1 \xrightarrow{t} I_2 \iff I_2 = \{\, f \to ts_1\, t \bullet ts_2 \mid f \to ts_1 \bullet t\, ts_2 \in \mathit{closure}[\![ I_1 ]\!] \,\} \qquad \text{(shift)}$$

$$I \stackrel{n}{\hookrightarrow} f \iff f \to t_1 \ldots t_n \bullet \in \mathit{closure}[\![ I ]\!] \qquad \text{(reduce)}$$

Given an initial item set $I_0$, the set $\mathcal{I}$ of all reachable item sets is defined by

$$\mathcal{I} = \{\, I \mid I_0 \to^{*} I \,\}$$

where $I_1 \to^{*} I_2 \iff I_1 = I_2 \lor \exists t\, \exists I' .\ I_1 \xrightarrow{t} I' \land I' \to^{*} I_2$. For our running
example, several selected item sets are listed in Fig. 9.
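
The closure, shift and reduce operations follow the classical LR(0) construction and can be sketched compactly. The code below is an illustration under assumptions: the grammar encoding (a dictionary from function names to lists of alternatives), the item encoding `(name, alternative, dot)`, the function names, and the toy left-recursive grammar `GRAMMAR` are all hypothetical and are not taken from Fig. 9 or Fig. 12.

```python
def closure(items, grammar):
    """Add an item 'g -> . rhs' for every function g called right after a dot."""
    result = set(items)
    worklist = list(items)
    while worklist:
        name, alt, dot = worklist.pop()
        rhs = grammar[name][alt]
        if dot < len(rhs) and rhs[dot] in grammar:  # dot is before a call
            callee = rhs[dot]
            for k in range(len(grammar[callee])):
                item = (callee, k, 0)
                if item not in result:
                    result.add(item)
                    worklist.append(item)
    return frozenset(result)


def shift(items, symbol, grammar):
    """Item set reached by moving the dot over `symbol` (kernel items only)."""
    return frozenset(
        (n, a, d + 1)
        for (n, a, d) in closure(items, grammar)
        if d < len(grammar[n][a]) and grammar[n][a][d] == symbol
    )


def reductions(items, grammar):
    """Pairs (f, n) such that a complete item 'f -> t1 ... tn .' is present."""
    return {
        (n, len(grammar[n][a]))
        for (n, a, d) in closure(items, grammar)
        if d == len(grammar[n][a])
    }


# Hypothetical toy grammar with a left-recursive function 'count'.
GRAMMAR = {
    "main": [["count", "O?"]],
    "count": [["count", "S?"], ["[]?"]],
}
I0 = frozenset({("main", 0, 0)})
```

In this toy grammar the item set reached from `I0` under `count` contains two items whose next symbols are the distinct matching operations O? and S?, so the choice between them is guarded and no reduction competes with the shifts.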

As known from LR parsing, some item sets may be inadequate, that is, they
contain a shift/reduce or a reduce/reduce conflict. In addition to these two
classical conflicts, we have a conflict which is specific to our problem domain (a
shift/shift conflict): only pattern matching operations are semantically significant
with respect to the choice of alternatives, while other operations do not contribute
to such a choice. Both shifts must pass over different matching operations. With
Match we denote the set of all matching operations c? in the grammar language.
    Det⟦ I_i, is ⟧   =  { f_[i:is] → a ctxt⟦ I_i, I_j, is ⟧ | (a, I_j) ∈ S_i }

    gen⟦ I_i, is ⟧   =  a ctxt⟦ I_i, I_j, is ⟧          if S_i = {(a, I_j)}
    gen⟦ I_i, is ⟧   =  f_[i:is]                         if S_i ⊃ {(a, I_j)}
    gen⟦ I_i, is ⟧   =  cut⟦ I_i, is, f, n ⟧             if $I_i \stackrel{n}{\hookrightarrow} f$

    ctxt⟦ I_0, I_j, is ⟧  =  gen⟦ I_j, is ⟧
    ctxt⟦ I_i, I_j, is ⟧  =  gen⟦ I_j, [ ] ⟧ cut⟦ I_j, i:is, f, n ⟧     if R_j = {(f, n)}
    ctxt⟦ I_i, I_j, is ⟧  =  gen⟦ I_j, i:is ⟧                           otherwise

    cut⟦ I_i, [i_1, ..., i_m], f, n ⟧            =  ε                                         if m < n
    cut⟦ I_i, [i_1, ..., i_n, ..., i_m], f, n ⟧  =  ctxt⟦ I_{i_n}, I_j, [i_{n+1}, ..., i_m] ⟧    if $I_{i_n} \xrightarrow{f} I_j$

    where  S_i = { (a, I_j) | $I_i \xrightarrow{a} I_j$ }
           R_i = { (f, n) | f → t_1 ... t_n • ts ∈ I_i }

Fig. 10. Code generation



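An item set $I$ is inadequate if one of the following conditions holds:

$$I \xrightarrow{t} I' \;\land\; I \stackrel{n}{\hookrightarrow} f \qquad \text{(shift/reduce)}$$

$$I \stackrel{n_1}{\hookrightarrow} f_1 \;\land\; I \stackrel{n_2}{\hookrightarrow} f_2 \;\land\; (f_1, n_1) \neq (f_2, n_2) \qquad \text{(reduce/reduce)}$$

$$I \xrightarrow{t_1} I_1 \;\land\; I \xrightarrow{t_2} I_2 \;\land\; t_1 \neq t_2 \;\land\; \{t_1, t_2\} \not\subseteq \mathit{Match} \qquad \text{(shift/shift)}$$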

Code Generation Given the shift and reduce relations, we now define the code 
generation. Code generation is only applied if all sets of items are conflict-free. 
Instead of generating a table- or procedure-driven parser, we generate a program 
in our grammar representation, which will then be converted into a functional 
program. The main task of the code generation is to produce for each item set 
a new function definition in the grammar language. The algorithm makes use 
of the shift and reduce relations for the given grammar program. It compresses 
redundant transitions between calls on the fly. 

Fig. 12 shows the result for our running example. Inversion is successful.
Finally, the grammar representation is translated into a syntactically correct
functional program. This translation (not defined here) reintroduces variables and
converts each operation into a functional language construct. It also determines
the arity and coarity of functions. This representation is easier to read, but less
easy to manipulate. The inverse program $pack^{-1}$ is shown in Fig. 3. We have
automatically produced an unpack function from a pack function. For instance,
to unpack a packed symbol list: $pack^{-1}$([(A,2)(B,1)(C,3)]) = [AABCCC].

For simplicity, we assume that all item sets can be identified by a unique
index ($I_1$, $I_2$, etc.). These indices will be used to generate new function names
and tell us about the context of a function call. For each item set $I_i$ we compute
a ‘Shift’ set $S_i$ and a ‘Reduce’ set $R_i$. Set $S_i$ tells us the item set $I_j$ that
we reach by performing operation a; set $R_i$ tells us the names f of the functions
used in an item set and the number n of operations passed. Functions $\mathit{gen}[\![\cdot]\!]$
and $\mathit{ctxt}[\![\cdot]\!]$ make use of these sets. They are defined in Fig. 10; the R and S
sets for our running example are shown in Fig. 11.

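Definition 5 (backend). Let q be a grammar program and $I_0$ be the initial
item set for q. Then, the generation of a deterministic program for a (possibly)
nondeterministic program q is defined by

$$\mathit{DET}[\![ q ]\!] \stackrel{\mathrm{def}}{=} \mathit{DET}'[\![\, \mathit{Det}[\![ I_0, [\,] ]\!] \,]\!]$$

$$\mathit{DET}'[\![ q ]\!] \stackrel{\mathrm{def}}{=} \begin{cases} q & \text{if } q' = q \\ \mathit{DET}'[\![ q' ]\!] & \text{if } q' \neq q \end{cases} \qquad \text{where } q' = \bigcup_{f_{i:is} \in q} \mathit{Det}[\![ I_i, is ]\!] \;\cup\; q$$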
1) Function-to-symmetric translation:

    pack → in(s), s=[], []=x, out(x)
    pack → in(s), s=c:r, O=x, len(c,x,r)=(p,l), pack(l)=(y), p:y=z, out(z)
    len  → in(c,n,s), s=[], (c,n)=x, []=y, out(x,y)
    len  → in(c,n,s), s=d:r, (c,d)=x, ⌊x⌋=y,
           y=(e), S(n)=z, len(e,z,r)=(p,t), out(p,t)
    len  → in(c,n,s), s=d:r, (c,d)=x, ⌊x⌋=y,
           y=(e,f), (e,n)=z, f:r=w, out(z,w)

2) Local inversion:

    pack⁻¹ → in(x), x=[], []=s, out(s)
    pack⁻¹ → in(z), z=p:y, pack⁻¹(y)=(l), len⁻¹(p,l)=(c,x,r), x=O, c:r=s, out(s)
    len⁻¹  → in(x,y), y=[], x=(c,n), []=s, out(c,n,s)
    len⁻¹  → in(p,t), len⁻¹(p,t)=(e,z,r), z=S(n), (e)=y,
             ⌊y⌋=x, x=(c,d), d:r=s, out(c,n,s)
    len⁻¹  → in(z,w), w=f:r, z=(e,n), (e,f)=y,
             ⌊y⌋=x, x=(c,d), d:r=s, out(c,n,s)

3) Symmetric-to-grammar translation:

    pack⁻¹ → (1) []? () []! (1)
    pack⁻¹ → (1) _:_? (2) pack⁻¹ (2,1) len⁻¹ (2) O? (1,2) _:_! (1)
    len⁻¹  → (2) []? (1) (_,_)? () []! (2,3,1)
    len⁻¹  → len⁻¹ (2) S? (2) (_)! (1) ⌊_⌋ (1) (_,_)? (2,4) _:_! (2,3,1)
    len⁻¹  → (2) _:_? (3) (_,_)? (1,3) (_,_)! (1) ⌊_⌋ (1) (_,_)? (2,4) _:_! (2,3,1)

4) Elimination of nondeterminism:

    f[0]        → (1) f[1]
    f[1]        → []? () []! (1)
    f[1]        → _:_? (2) (1) f[…] (2,1) (2) f[…,6] (2) f[11,10,9]
    f[11,10,9]  → O? (1,2) _:_! (1)
    f[11,10,9]  → S? (2) (_)! (1) ⌊_⌋ (1) (_,_)? (2,4) _:_! (2,3,1) (2) f[11,10,9]
    f[…,6]      → []? (1) (_,_)? () []! (2,3,1)
    f[…,6]      → _:_? (3) (_,_)? (1,3) (_,_)! (1) ⌊_⌋ (1) (_,_)? (2,4) _:_! (2,3,1)

Fig. 12. Inversion of program pack
The transformation into a deterministic grammar by $\mathit{DET}[\![\cdot]\!]$ does not terminate
iff there exists a loop of item sets such that all sets $R_k$ in this loop contain two
or more elements. With another transformation that introduces some
administrative code, even these programs for which our algorithm does not terminate
can be converted into a deterministic grammar program. Our goal was to avoid 
the introduction of administrative overhead and we found that our transforma- 
tion is successful for many programs. We omit the definition of FCT| • ] which 
translates a grammar program back into a functional program. 



5 Related Work 

The method presented in this paper is based on the principle of global in- 
version based on local invertibility [6,11]. The work was originally inspired by 
KEinv [13,7]. In contrast to KEinv, our method can successfully deal with
equality and duplication of variables. Most studies on functional languages and
program inversion have involved program inversion by hand (e.g., [14]). They may
be more powerful at the price of automation. This is the usual trade-off.
Inversion based on Refal graphs [16,10,17,15] is related to the present method in that
both use atomic operations for inversion. An algorithm for inverse computation 
can be found in [1,2]. It performs inverse computation also on programs that are 
not injective; it does not produce inverse programs but performs the inversion 
of a program interpretively. 

6 Conclusion 

We presented an automatic method for deriving deterministic inverse programs 
by adopting techniques known from LR parsing, in particular, LR(0) parsing. We 
formalized the transformation and illustrated it with an example. This greatly 
expands the application of our recent method for program inversion [8] by elim- 
inating nondeterminism from inverse programs by a global transformation. This 
allows us to invert programs for which this was not possible before. For exam- 
ple, the method in this paper can invert function tailcons and the tail-recursive 
version of function reverse [8, Sect. 6]. 

We have also reached the borderline where more inverse programs can be
made deterministic, but at the price of introducing additional administrative
overhead or the use of LR(k), $k > 0$, that is, parsing methods that involve
lookahead operations. It will be a task for future work to study the relative gains
by adopting such techniques. We used a grammar-like program representation.
Other representations are possible and future work will need to identify which 
representation is most suitable for eliminating nondeterminism. 

Acknowledgements We are grateful to the anonymous reviewers for their detailed 
and useful feedback. 